207 Human Secreted Proteins

Field of the Invention 

This invention relates to newly identified polynucleotides and the polypeptides 
encoded by these polynucleotides, uses of such polynucleotides and polypeptides, and 
their production.

Background of the Invention 

Unlike bacteria, which exist as a single compartment surrounded by a
membrane, human cells and other eukaryotes are subdivided by membranes into many
functionally distinct compartments. Each membrane-bounded compartment, or
organelle, contains different proteins essential for the function of the organelle. The cell
uses "sorting signals," which are amino acid motifs located within the protein, to target
proteins to particular cellular organelles.

One type of sorting signal, called a signal sequence, a signal peptide, or a leader
sequence, directs a class of proteins to an organelle called the endoplasmic reticulum
(ER). The ER separates the membrane-bounded proteins from all other types of
proteins. Once localized to the ER, both groups of proteins can be further directed to
another organelle called the Golgi apparatus. Here, the Golgi distributes the proteins to
vesicles, including secretory vesicles, the cell membrane, lysosomes, and the other
organelles.

Proteins targeted to the ER by a signal sequence can be released into the
extracellular space as a secreted protein. For example, vesicles containing secreted
proteins can fuse with the cell membrane and release their contents into the extracellular
space - a process called exocytosis. Exocytosis can occur constitutively or after receipt
of a triggering signal. In the latter case, the proteins are stored in secretory vesicles (or
secretory granules) until exocytosis is triggered. Similarly, proteins residing on the cell
membrane can also be secreted into the extracellular space by proteolytic cleavage of a
"linker" holding the protein to the membrane.

Despite the great progress made in recent years, only a small number of genes
encoding human secreted proteins have been identified. These secreted proteins include
the commercially valuable human insulin, interferon, Factor VIII, human growth
hormone, tissue plasminogen activator, and erythropoietin. Thus, in light of the
pervasive role of secreted proteins in human physiology, a need exists for identifying

Summary of the Invention

The present invention relates to novel polynucleotides and the encoded 
polypeptides. Moreover, the present invention relates to vectors, host cells, antibodies, 
and recombinant methods for producing the polypeptides and polynucleotides. Also 
provided are diagnostic methods for detecting disorders related to the polypeptides, and 
therapeutic methods for treating such disorders. The invention further relates to 
screening methods for identifying binding partners of the polypeptides. 



Detailed Description 
Definitions 

The following definitions are provided to facilitate understanding of certain 
terms used throughout this specification. 

In the present invention, "isolated" refers to material removed from its original
environment (e.g., the natural environment if it is naturally occurring), and thus is
altered "by the hand of man" from its natural state. For example, an isolated
polynucleotide could be part of a vector or a composition of matter, or could be
contained within a cell, and still be "isolated" because that vector, composition of
matter, or particular cell is not the original environment of the polynucleotide.

In the present invention, a "secreted" protein refers to those proteins capable of
being directed to the ER, secretory vesicles, or the extracellular space as a result of a
signal sequence, as well as those proteins released into the extracellular space without
necessarily containing a signal sequence. If the secreted protein is released into the
extracellular space, the secreted protein can undergo extracellular processing to produce
a "mature" protein. Release into the extracellular space can occur by many
mechanisms, including exocytosis and proteolytic cleavage.

As used herein, a "polynucleotide" refers to a molecule having a nucleic acid
sequence contained in SEQ ID NO:X or the cDNA contained within the clone deposited
with the ATCC. For example, the polynucleotide can contain the nucleotide sequence
of the full length cDNA sequence, including the 5' and 3' untranslated sequences, the
coding region, with or without the signal sequence, the secreted protein coding region,
as well as fragments, epitopes, domains, and variants of the nucleic acid sequence.
Moreover, as used herein, a "polypeptide" refers to a molecule having the translated
amino acid sequence generated from the polynucleotide as broadly defined.

In the present invention, the full length sequence identified as SEQ ID NO:X
was often generated by overlapping sequences contained in multiple clones (contig
analysis). A representative clone containing all or most of the sequence for SEQ ID
NO:X was deposited with the American Type Culture Collection ("ATCC"). As
shown in Table 1, each clone is identified by a cDNA Clone ID (Identifier) and the
ATCC Deposit Number. The ATCC is located at 10801 University Boulevard,
Manassas, Virginia 20110-2209, USA. The ATCC deposit was made pursuant to the
terms of the Budapest Treaty on the international recognition of the deposit of
microorganisms for purposes of patent procedure.
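
The idea of building a longer sequence from overlapping clone sequences can be illustrated with a minimal sketch. This is not the assembly method used for SEQ ID NO:X, which the text does not describe; the function name `merge_overlapping`, the `min_overlap` parameter, and the toy sequences are all hypothetical, and the sketch assumes exact, error-free overlaps.

```python
def merge_overlapping(left, right, min_overlap=3):
    """Merge two sequences when a suffix of `left` equals a prefix of `right`.

    Returns the merged sequence, or None when no exact overlap of at least
    `min_overlap` bases exists. The longest overlap is tried first.
    """
    longest = min(len(left), len(right))
    for size in range(longest, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    return None


# Toy (hypothetical) clone sequences that share the four bases "CTTA".
contig = merge_overlapping("ATGGCCTTA", "CTTAGGCAT")
print(contig)
```

Real contig assembly also has to tolerate sequencing errors and consider both strands, which this sketch deliberately leaves out.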

A "polynucleotide" of the present invention also includes those polynucleotides 
capable of hybridizing, under stringent hybridization conditions, to sequences contained
in SEQ ID NO:X, the complement thereof, or the cDNA within the clone deposited with
the ATCC. "Stringent hybridization conditions" refers to an overnight incubation at 42°C
in a solution comprising 50% formamide, 5x SSC (750 mM NaCl, 75 mM sodium
citrate), 50 mM sodium phosphate (pH 7.6), 5x Denhardt's solution, 10% dextran
sulfate, and 20 µg/ml denatured, sheared salmon sperm DNA, followed by washing the
filters in 0.1x SSC at about 65°C.
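
The salt concentrations implied by an "Nx SSC" designation follow by simple scaling. The sketch below assumes the conventional 1x SSC composition (150 mM NaCl, 15 mM sodium citrate), which agrees with the 5x values stated above; the function name `ssc_composition` is hypothetical.

```python
# Composition of 1x SSC in millimolar. Assumed from convention; it is
# consistent with the stated "5x SSC (750 mM NaCl, 75 mM sodium citrate)".
SSC_1X_MM = {"NaCl": 150.0, "sodium citrate": 15.0}


def ssc_composition(fold):
    """Return the mM concentration of each SSC component at `fold`x strength."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {name: conc * fold for name, conc in SSC_1X_MM.items()}


hybridization = ssc_composition(5)   # hybridization solution strength
wash = ssc_composition(0.1)          # stringent wash strength
print(hybridization)
print(wash)
```

The fifty-fold drop in salt between hybridization (5x) and wash (0.1x), together with the higher wash temperature, is what makes the wash step stringent.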

Also contemplated are nucleic acid molecules that hybridize to the
polynucleotides of the present invention at lower stringency hybridization conditions.
Changes in the stringency of hybridization and signal detection are primarily
accomplished through the manipulation of formamide concentration (lower percentages
of formamide result in lowered stringency), salt conditions, or temperature. For
example, lower stringency conditions include an overnight incubation at 37°C in a
solution comprising 6X SSPE (20X SSPE = 3M NaCl; 0.2M NaH2PO4; 0.02M EDTA,
pH 7.4), 0.5% SDS, 30% formamide, 100 µg/ml salmon sperm blocking DNA;
followed by washes at 50°C with 1X SSPE, 0.1% SDS. In addition, to achieve even
lower stringency, washes performed following stringent hybridization can be done at
higher salt concentrations (e.g. 5X SSC).
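
As a worked example of the SSPE arithmetic, the following sketch computes how much 20X SSPE stock is needed for a given working strength and what the resulting molarities are. Only the 20X composition comes from the text; the function name `dilute_sspe` and the 100 ml volume are hypothetical, and the other buffer components (SDS, formamide, blocking DNA) are ignored.

```python
# 20X SSPE stock composition in molar, as stated in the text.
SSPE_20X_M = {"NaCl": 3.0, "NaH2PO4": 0.2, "EDTA": 0.02}


def dilute_sspe(target_fold, final_volume_ml, stock_fold=20.0):
    """Return (stock volume in ml, remaining volume in ml, final molarities)."""
    if not 0 < target_fold <= stock_fold:
        raise ValueError("target strength must be positive and at most the stock strength")
    ratio = target_fold / stock_fold
    stock_ml = final_volume_ml * ratio
    final = {name: conc * ratio for name, conc in SSPE_20X_M.items()}
    return stock_ml, final_volume_ml - stock_ml, final


# 6X SSPE for hybridization and 1X SSPE for the washes, 100 ml each.
print(dilute_sspe(6, 100))
print(dilute_sspe(1, 100))
```

The remaining volume is the space left for water and the other components of the solution, not a volume of pure diluent.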

Note that variations in the above conditions may be accomplished through the 
inclusion and/or substitution of alternate blocking reagents used to suppress 
background in hybridization experiments. Typical blocking reagents include 
Denhardt's reagent, BLOTTO, heparin, denatured salmon sperm DNA, and 
commercially available proprietary formulations. The inclusion of specific blocking

complementary stretch of T (or U) residues, would not be included in the definition of
"polynucleotide," since such a polynucleotide would hybridize to any nucleic acid
molecule containing a poly (A) stretch or the complement thereof (e.g., practically any
double-stranded cDNA clone).
The polynucleotide of the present invention can be composed of any
polyribonucleotide or polydeoxyribonucleotide, which may be unmodified RNA or DNA
or modified RNA or DNA. For example, polynucleotides can be composed of single-
and double-stranded DNA, DNA that is a mixture of single- and double-stranded
regions, single- and double-stranded RNA, and RNA that is a mixture of single- and
double-stranded regions, hybrid molecules comprising DNA and RNA that may be
single-stranded or, more typically, double-stranded or a mixture of single- and double-
stranded regions. In addition, the polynucleotide can be composed of triple-stranded
regions comprising RNA or DNA or both RNA and DNA. A polynucleotide may also
contain one or more modified bases or DNA or RNA backbones modified for stability
or for other reasons. "Modified" bases include, for example, tritylated bases and
unusual bases such as inosine. A variety of modifications can be made to DNA and
RNA; thus, "polynucleotide" embraces chemically, enzymatically, or metabolically
modified forms.

The polypeptide of the present invention can be composed of amino acids joined
to each other by peptide bonds or modified peptide bonds, i.e., peptide isosteres, and
may contain amino acids other than the 20 gene-encoded amino acids. The
polypeptides may be modified by either natural processes, such as posttranslational
processing, or by chemical modification techniques which are well known in the art.
Such modifications are well described in basic texts and in more detailed monographs,
as well as in a voluminous research literature. Modifications can occur anywhere in a
polypeptide, including the peptide backbone, the amino acid side-chains and the amino
or carboxyl termini. It will be appreciated that the same type of modification may be
present in the same or varying degrees at several sites in a given polypeptide. Also, a
given polypeptide may contain many types of modifications. Polypeptides may be
branched, for example, as a result of ubiquitination, and they may be cyclic, with or
without branching. Cyclic, branched, and branched cyclic polypeptides may result
from posttranslational natural processes or may be made by synthetic methods.
Modifications include acetylation, acylation, ADP-ribosylation, amidation, covalent
attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a
nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative,
covalent attachment of phosphatidylinositol, cross-linking, cyclization, disulfide bond
formation, demethylation, formation of covalent cross-links, formation of cysteine,
formation of pyroglutamate, formylation, gamma-carboxylation, glycosylation, GPI
anchor formation, hydroxylation, iodination, methylation, myristoylation, oxidation,
pegylation, proteolytic processing, phosphorylation, prenylation, racemization,
selenoylation, sulfation, transfer-RNA mediated addition of amino acids to proteins
such as arginylation, and ubiquitination. (See, for instance, PROTEINS -
STRUCTURE AND MOLECULAR PROPERTIES, 2nd Ed., T. E. Creighton, W.
H. Freeman and Company, New York (1993); POSTTRANSLATIONAL
COVALENT MODIFICATION OF PROTEINS, B. C. Johnson, Ed., Academic
Press, New York, pgs. 1-12 (1983); Seifter et al., Meth Enzymol 182:626-646 (1990);
Rattan et al., Ann NY Acad Sci 663:48-62 (1992).)

"SEQ ID NO:X" refers to a polynucleotide sequence while "SEQ ID NO:Y" 
refers to a polypeptide sequence, both sequences identified by an integer specified in 
Table 1.

"A polypeptide having biological activity" refers to polypeptides exhibiting 
activity similar, but not necessarily identical to, an activity of a polypeptide of the
present invention, including mature forms, as measured in a particular biological assay,
with or without dose dependency. In the case where dose dependency does exist, it
need not be identical to that of the polypeptide, but rather substantially similar to the
dose-dependence in a given activity as compared to the polypeptide of the present
invention (i.e., the candidate polypeptide will exhibit greater activity or not more than
about 25-fold less and, preferably, not more than about tenfold less activity, and most
preferably, not more than about three-fold less activity relative to the polypeptide of the
present invention).
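
The fold-activity thresholds in this definition can be expressed as a small classifier. This is a minimal sketch: the function name `activity_tier` and the tier labels are hypothetical, and because the text says "about" for each threshold, treating 3, 10, and 25 as hard cutoffs is a simplifying assumption.

```python
def activity_tier(candidate_activity, reference_activity):
    """Classify a candidate by how many fold less active it is than the reference.

    Both activities must be measured in the same assay and the same units.
    """
    if candidate_activity <= 0 or reference_activity <= 0:
        raise ValueError("activities must be positive")
    fold_less = reference_activity / candidate_activity
    if fold_less <= 3:       # also covers candidates with greater activity
        return "most preferred"
    if fold_less <= 10:
        return "preferred"
    if fold_less <= 25:
        return "within definition"
    return "outside definition"


print(activity_tier(20.0, 100.0))   # candidate is five-fold less active
```

A candidate with greater activity than the reference has a fold value below 1 and therefore falls in the first tier.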

Polynucleotides and Polypeptides of the Invention

FEATURES OF PROTEIN ENCODED BY GENE NO: 1 

This gene is expressed primarily in melanocytes and, to a lesser extent, in 
testes, ovary, kidney and other tissues. 
Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancer, disorders of neural crest derived cells including pigmentation

a number of disorders of the above tissues or cells, particularly of the skin,
reproductive, and renal systems, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treating disorders that arise from alterations in 
the number or fate of neural crest derived cells including cancers such as melanoma and 
defects of the developing reproductive system.

FEATURES OF PROTEIN ENCODED BY GENE NO: 2 

This gene is expressed primarily in infant brain and fetal lung. 
Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental disorders of the brain or lung. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the central nervous and pulmonary systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treating or diagnosing disorders associated 
with abnormal proliferation of cells in the central nervous system and developing lung.

FEATURES OF PROTEIN ENCODED BY GENE NO: 3 

This gene is expressed primarily in breast lymph node and to a lesser extent in 
ovarian cancer and chondrosarcoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune responses such as inflammation or immune surveillance for
tumors. This gene may be important for inflammatory responses associated with
tumors. Similarly, polypeptides and antibodies directed to these polypeptides are useful
in providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune system, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 236 as residues: Lys-45 to Val-50, Lys-69 to Arg-76.
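
The residue-range notation used for preferred epitopes (for example "Lys-45 to Val-50") can be turned into numeric coordinates with a short parser. This is a minimal sketch: the pattern and the function name `parse_epitope_ranges` are hypothetical, and the parser only reads the notation; it does not check the ranges against the actual SEQ ID NO sequences, which are not reproduced here.

```python
import re

# Three-letter residue code, hyphen, position, "to", and the same again.
# Optional whitespace is tolerated around the numbers and after "to".
RANGE_PATTERN = re.compile(
    r"([A-Z][a-z]{2})-\s*(\d+)\s+to\s*([A-Z][a-z]{2})-\s*(\d+)"
)


def parse_epitope_ranges(text):
    """Return (start_residue, start_pos, end_residue, end_pos) for each range."""
    return [
        (first, int(start), last, int(end))
        for first, start, last, end in RANGE_PATTERN.findall(text)
    ]


ranges = parse_epitope_ranges("Lys-45 to Val-50, Lys-69 to Arg-76")
for first, start, last, end in ranges:
    # Ranges are inclusive at both ends, so the length is end - start + 1.
    print(first, start, last, end, end - start + 1)
```

With coordinates in hand, the corresponding peptide could be sliced from a polypeptide sequence using one-based, inclusive indexing.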

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment or diagnosis of immune responses
including those associated with tumor-induced inflammation. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 4 

This gene is expressed primarily in T-cells and T-cell lymphomas. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, immunological diseases involving T-cells such as inflammation,
autoimmunity, and cancers including T-cell lymphomas. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of T-cells and other cells of the immune 
system, expression of this gene at significantly higher or lower levels may be routinely 
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., 
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample 
taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosing and treating T-cell based disorders

FEATURES OF PROTEIN ENCODED BY GENE NO: 5

This gene is expressed primarily in activated monocytes. 
Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammation, autoimmunity, infection, or disorders involving activation 
of monocytes. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the tissue(s) 
or cell type(s). For a number of disorders of the above tissues or cells, particularly of 
the immune system, expression of this gene at significantly higher or lower levels may 
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 238 as residues: Asp-19 to Arg-31.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosing or treating diseases that result in 
activation of monocytes including infections, inflammatory responses or autoimmune 
diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 6 

The translation product of this gene shares sequence homology with terminal
deoxynucleotidyltransferase which is thought to be important in catalyzing the
elongation of oligo- or polydeoxynucleotide chains.

This gene is expressed primarily in activated human neutrophils. 
Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, cancer, particularly those of the blood such as leukemia and deficiencies
in neutrophils such as neutropenia. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the cardiovascular system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to terminal deoxynucleotidyltransferase
indicate that polynucleotides and polypeptides corresponding to this gene are useful for
treatment and differential diagnosis of acute leukemias. Alternatively, this gene may
function in the proliferation of neutrophils and be useful as a treatment for neutropenia,
for example, neutropenia that results from chemotherapy.

FEATURES OF PROTEIN ENCODED BY GENE NO: 7 

The contig exhibits a reasonable homology to the human chorionic gonadotropin
(HCG) analogue-GT beta-subunit as disclosed in U.S. Patent No. 5,508,261 and PCT
Publication No. WO 92/22568. There is a high degree of conservation of the
structurally important cysteine residues in these identities.

This gene is expressed primarily in IL-1 and LPS induced neutrophils. 
Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases of the immune system, including inflammatory diseases and
allergies. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the tissue(s) 
or cell type(s). For a number of disorders of the above tissues or cells, particularly of 
the immune system, expression of this gene at significantly higher or lower levels may 
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment/diagnosis of diseases of the immune 
system since expression is primarily in neutrophils, and may be useful as a growth 
factor for the differentiation or proliferation of neutrophils for the treatment of 
neutropenia following chemotherapy. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 8

biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases of the immune system, including inflammatory diseases and 
allergies. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the tissue(s) 
or cell type(s). For a number of disorders of the above tissues or cells, particularly of 
the immune system, expression of this gene at significantly higher or lower levels may 
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. Preferred epitopes include those comprising a 
sequence shown in SEQ ID NO: 241 as residues: Ser-14 to Pro-22, Leu-43 to Val-53. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment and diagnosis of diseases of the 
immune system since expression is primarily in neutrophils, and may be useful as a
growth factor for the differentiation or proliferation of neutrophils for the treatment of
neutropenia following chemotherapy.

FEATURES OF PROTEIN ENCODED BY GENE NO: 9 

This gene is expressed primarily in IL-1 and LPS induced neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases of the immune system, including inflammatory diseases and
allergies. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune system, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. Preferred epitopes include those comprising a 
sequence shown in SEQ ID NO: 242 as residues: Tyr-22 to His-35. 

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment/diagnosis of diseases of the immune
system since expression is primarily in neutrophils, and may be useful as a growth
factor for the differentiation or proliferation of neutrophils for the treatment of
neutropenia following chemotherapy.

WO 98/54963 PCT/US98/11422

FEATURES OF PROTEIN ENCODED BY GENE NO: 10 

This gene is expressed primarily in activated T-cells and to a lesser extent in 
endothelial cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, immune dysfunctions including cancer of the T lymphocytes and 
autoimmune disorders and inflammation. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment and diagnosis of immune disorders 
particularly of T-cell origin and may act as a growth factor for particular subsets of T- 
cells such as CD4 positive cells which would make this a useful therapeutic for the 
treatment of HIV and other immune compromising illnesses. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 11 

This gene is expressed primarily in fetal tissue. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of many developmental abnormalities. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the developing fetus,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful as a growth factor or differentiation factor for 
particular cell types in the developing fetus and may be useful in replacement or other
types of therapy in cases where the gene is expressed aberrantly. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 12 

This gene is expressed primarily in T-cells and to a lesser extent in tumor tissue
including glioblastoma, meningioma, and Wilms tumor.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases of the immune system including autoimmune conditions such as
rheumatoid arthritis, inflammatory disorders and cancer. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 245 as residues:
Thr-9 to Ser-14.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis/modulation of immune function
disorders, including rheumatoid arthritis and inflammatory responses. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 13

This gene is expressed primarily in placenta and to a lesser extent in fetal liver 
and bone marrow. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of hematological disorders. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological 
probes for differential identification of the tissue(s) or cell type(s). For a number of 
disorders of the above tissues or cells, particularly of the hematological and immune
systems, expression of this gene at significantly higher or lower levels may be routinely 
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., 
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 
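
The comparison described above, a sample's expression level measured against the standard level from healthy tissue or fluid, can be illustrated with a small sketch. The function name `classify_expression` and the two-fold cutoff are assumptions made for illustration only; the specification does not define what counts as "significantly" higher or lower, and a fold-change cutoff is not a statistical test.

```python
import math


def classify_expression(sample_level, standard_level, fold_threshold=2.0):
    """Compare a sample expression level with the standard (healthy) level.

    fold_threshold is a hypothetical cutoff, not a value taken from the
    specification. Returns "higher", "lower", or "comparable".
    """
    if sample_level <= 0 or standard_level <= 0:
        raise ValueError("expression levels must be positive")
    if fold_threshold <= 1:
        raise ValueError("fold_threshold must be greater than 1")

    # A log2 ratio treats over- and under-expression symmetrically.
    log2_fold_change = math.log2(sample_level / standard_level)
    cutoff = math.log2(fold_threshold)

    if log2_fold_change >= cutoff:
        return "higher"
    if log2_fold_change <= -cutoff:
        return "lower"
    return "comparable"


if __name__ == "__main__":
    # Made-up numbers in arbitrary units, for illustration only.
    for sample, standard in [(8.0, 2.0), (1.0, 4.0), (3.0, 2.0)]:
        print(sample, standard, classify_expression(sample, standard))
```

In practice the standard level would come from replicate measurements, and the cutoff would be replaced by whatever criterion the assay supports.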

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful as a growth factor for hematopoietic stem cells or
progenitor cells in the treatment of chemotherapy patients or kidney disease. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 14 

This gene is expressed primarily in stromal cells.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of hematopoietic disorders including cancer,
neutropenia, anemia, and thrombocytopenia. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of 
the above tissues or cells, particularly of the hematopoietic and immune systems, expression of
this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful as a growth factor for hematopoietic stem cells or
progenitor cells, in particular following chemotherapy treatment. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 15 

The translation product of this gene shares sequence homology with epsilon-COP
from Bos taurus which is thought to be important as a component of coatomer, a
complex of seven proteins [...]
VFLYRAYLAQRKFGWLDEIKPSSAPELQAVRMFADYLAHESRRDSIVAELDRE
MSRSXDVTNTTFLLMAASIYLHDQNPDAALRALHQGDSLECTAMTVQILLKLD
RLDLARKELKRMQDLDEDATLTQLATAWVSLATGGEKLQDAYYIFQEMADKCS
PTLLLLNGQAACHMAQGRWEAAEGLLQEALDKDSGYPETLVNLIVLSQHLGKP
PHVTNRYLSQLKDAHRSHPFIKEYQAKENDFDRLVLQYAPSAEAGPELSGP
(SEQ ID NO:458); or RDVERDVFLYRAYEAQRKFGVVLDEIKPSSAPELQAVRMF
ADYLAHESRRDSIVAELDREMSRSXDVTNTTFLLMAASIYLHDQNPDAALRALH
QGDSLECTAMTVQILLKLDREDLARKELKRMQDLDEDATLTQLATAWVSLATG
GEKLQDAYYIFQEMADKCSPTLLLLNGQAACHMAQGRWEAAEGLLQEALDKD
SGYPETLVNLIVESQHLGKPPEVTNRYLSQLKDAHRSFIPFIKEYQAKENDFDRL
VLQYAPSA (SEQ ID NO:459).

This gene is expressed primarily in activated monocytes and T-cells, and to a 
lesser extent in multiple other tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunomodulation, specifically relating to transport problems in these
cells. Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. 

The tissue distribution and homology to epsilon-COP indicates that
polynucleotides and polypeptides corresponding to this gene are useful for
treating/diagnosing problems with the cellular transport of proteins that may result in
immunologic dysfunction.

FEATURES OF PROTEIN ENCODED BY GENE NO: 16 

The translation product of this gene shares sequence homology with an RNA 
helicase which is thought to be important in polynucleotide metabolism. The translation 
product of this contig exhibits good homology to the LbeIF4A antigen of Leishmania
braziliensis. The LbeIF4A antigen, or immunogenic portions of it, can be used to
induce protective immunity against leishmaniasis, specifically L. donovani, L. chagasi,
L. infantum, L. major, L. braziliensis, L. panamensis, L. tropica and L. guyanensis. It
can also be used diagnostically to detect Leishmania infection or to stimulate a cellular
and/or humoral immune response or to stimulate the production of interleukin-12.

This gene is expressed primarily in colon cancer and to a lesser extent in 
pituitary. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of cancers particularly of the colon. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the gastrointestinal 
system, expression of this gene at significantly higher or lower levels may be routinely 
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 249 as residues: Glu-93 to Ala-98, Gln-150 to
Leu-156, Leu-220 to Leu-231, Leu-268 to Arg-273, Val-324 to Pro-341, Arg-372 to
Asn-380, Ser-405 to Gly-410, Phe-426 to Ala-433, Glu-458 to Asp-470, Arg-506 to
Ser-547.
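
Epitope boundaries of the form "Glu-93 to Ala-98" can be read mechanically and checked against a sequence. The sketch below assumes that residue numbers are 1-indexed and that ranges are inclusive; the helper names `parse_epitope_ranges` and `slice_epitope` are hypothetical, and the toy sequence is invented for the demonstration (it is not SEQ ID NO: 249, which is not reproduced in this section).

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches ranges written like "Thr-9 to Ser-14".
RANGE_RE = re.compile(r"([A-Z][a-z]{2})-(\d+)\s+to\s+([A-Z][a-z]{2})-(\d+)")


def parse_epitope_ranges(text):
    """Return (start_residue, start_pos, end_residue, end_pos) tuples."""
    return [(a, int(i), b, int(j)) for a, i, b, j in RANGE_RE.findall(text)]


def slice_epitope(sequence, start_res, start_pos, end_res, end_pos):
    """Slice a 1-indexed, inclusive range and verify its boundary residues."""
    if start_pos < 1 or end_pos < start_pos or end_pos > len(sequence):
        raise ValueError("range falls outside the sequence")
    fragment = sequence[start_pos - 1:end_pos]
    if fragment[0] != THREE_TO_ONE[start_res]:
        raise ValueError("start residue does not match the sequence")
    if fragment[-1] != THREE_TO_ONE[end_res]:
        raise ValueError("end residue does not match the sequence")
    return fragment


if __name__ == "__main__":
    toy_sequence = "MKTEAGSLAVR"  # invented sequence, for illustration only
    for rng in parse_epitope_ranges("Thr-3 to Ser-7, Leu-8 to Arg-11"):
        print(rng, slice_epitope(toy_sequence, *rng))
```

The boundary check is useful with scanned text, where a misread residue number or name would otherwise pass silently.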

The tissue distribution and homology to RNA helicase indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for development 
of diagnostic tests for colon cancer.

FEATURES OF PROTEIN ENCODED BY GENE NO: 17 

The translation product of this contig has sequence homology to a cytoplasmic 
protein that binds specifically to JNK, designated the JNK interacting protein-1 or JIP-1
in mice. JIP-1 caused cytoplasmic retention of JNK and inhibition of JNK-regulated
gene expression. 

This gene is expressed primarily in brain, including pituitary, cerebellum, frontal
cortex, and fetal brain, and to a lesser extent in the kidney cortex.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential [...]
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the central nervous system,
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Furthermore, the translation product of this contig may suppress the effects of 
the JNK signaling pathway on cellular proliferation, including transformation by the 
Bcr-Abl oncogene. Preferred epitopes include those comprising a sequence shown in
SEQ ID NO: 250 as residues: Pro-6 to Ser-26, Ala-30 to Asp-41, Gly-55 to Ser-61,
Gly-74 to Thr-80, Tyr-117 to Ala-123, Tyr-167 to Asp-172, Ala-212 to Cys-223,
Pro-239 to Tyr-244.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for enhanced survival and/or differentiation of
neurons as a treatment for neurodegenerative disease. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 18 

The translation product of this gene shares sequence homology with a liver
stage antigen from a protozoan parasite.

This gene is expressed primarily in fetal tissue and to a lesser extent in activated 
T-cells and other immune cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental abnormalities and diseases of immune function. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution and homology to a protozoan antigen indicates that
polynucleotides and polypeptides corresponding to this gene are useful for 
treatment/immune modulation of parasitic infections. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 19 

Preferred polypeptides encoded by this gene comprise the following polypeptide
sequences:

MKAIGIEPSLATYHHIIRLFDQPGDPLKRSSFIIYDIMNELMGKRFSPKD
PDDDKFFQSAMSICSSLRDLELAYQVHGLLKTGDNWKFIGPDQHRNFYYSKFF
DLICLMEQIDVTLKWYEDLIPSAYFPHSQTMIHLLQALDVANRLEVIPKIWER
(SEQ ID NO:460); and/or KDSKEYGHTFRSDLREEILMLMARDKHPPELQVAF
ADCAADIKSAYESQPIRQTAQDWPATSLNCIAILFLRAGRTQEAWKMLGLFRKH
[...] VMSDFAINQ
EQKEALSNLTALTSDSDTDSSSDSDSDTSEGK (SEQ ID NO:461). Polynucleotides
encoding such polypeptides are also provided. 

This gene is expressed primarily in stromal and CD34 depleted bone marrow 
cells and to a lesser extent in tissues of embryonic origin. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases of hematologic origin including cancers and immune 
dysfunction. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the tissue(s) 
or cell type(s). For a number of disorders of the above tissues or cells, particularly of 
the hematopoietic and immune systems, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those 
comprising a sequence shown in SEQ ID NO: 252 as residues: Ser-28 to Gln-34.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful as a growth factor for hematopoietic stem cells or

WO 98/54963 PCT/US98/11422

FEATURES OF PROTEIN ENCODED BY GENE NO: 20 

Preferred polypeptide fragments can be found in an alternative open reading 
frame. These preferred polypeptides comprise the amino acid sequence: 
MSSDNESDIEDEDLKLELRRLRDKHLKEIQDLQSRQKHEIESLYTKLGKVPPAVI
IPPAAPLSGRRRRPTKSKGSKSSRSSSLGNKSPQESGNLSGQSAASVLHPQQTL
HPPGNIPESGQNQELQPLKPSPSSDNLYSAFTSDGAISVPSLSAPGQGTSSTNTV
GATVNSQAAQAQPPAMTSSRKGTFTDDLHKLVDNWARDAMNLSGRRGSKGH
MNYEGPGMARKFSAPGQLCISMTSNLGGSAPISAASATSEGHFTKSMCPPQQY
GFPATPFGAQWSGTGGPAPQPLGQFQPVGTASLQNFNISNLQKSISNPPGSNL
RTT (SEQ ID NO:462); IQDLQSRQKHEIESLYTKLGKVPPAVIIPPAAPLSGRRRR
PTKSKGSKSSRSSSLGNKSPQLSGNLSGQSAASVLHPQQTLHPPGNIPESGQN
QLLQPLKPSPSSDNLYSAFTSDGAISVPSLSAPGQGTSST (SEQ ID NO:463);
TSDGAISVPSLSAPGQGTSSTNTVGATVNSQAAQAQPPAMTSSRKGTFTDDLH
(SEQ ID NO:464); KGHMNYEGPGMARKFSAPGQLCISMTSNLGGSAPISAAS
ATSEGHFTK (SEQ ID NO:465); QPLKPSPSSDNLYSAFTSDGAISVPSLSAPG
(SEQ ID NO:466). Also preferred are polynucleotide fragments encoding these 
polypeptide fragments. 
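
Several of the preferred fragments above appear to be subsequences of the longer sequence. A minimal sketch of how such a fragment could be located is given below; `locate_fragment` is a hypothetical helper, coordinates are reported 1-indexed and inclusive by assumption, and the parent string is only a short excerpt of the sequence printed above, not the full SEQ ID NO:462.

```python
def locate_fragment(parent, fragment):
    """Return the 1-indexed, inclusive (start, end) of the first exact
    occurrence of fragment in parent, or None if it does not occur."""
    if not fragment:
        raise ValueError("fragment must not be empty")
    index = parent.find(fragment)
    if index == -1:
        return None
    return index + 1, index + len(fragment)


if __name__ == "__main__":
    # Short excerpt of the longer sequence, used only for the demonstration.
    parent_excerpt = "QPLKPSPSSDNLYSAFTSDGAISVPSLSAPGQGTSSTNTV"
    fragment = "TSDGAISVPSLSAPGQGTSST"
    print(locate_fragment(parent_excerpt, fragment))
    print(locate_fragment(parent_excerpt, "WWWW"))
```

Exact matching is deliberately strict: a single misread character in either string makes the lookup return None, which is a quick way to spot transcription errors in a scanned sequence.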

This gene is expressed in fetal liver and tissues associated with the CNS.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, liver and CNS diseases. Similarly, polypeptides and antibodies directed 
to these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the liver and CNS, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 253 as residues: 
Gln-26 to Lys-34. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment for liver diseases such
as hepatocellular carcinomas and diseases of the CNS.

FEATURES OF PROTEIN ENCODED BY GENE NO: 21

In an alternative reading frame, this gene shows sequence homology to two
recently cloned genes, karyopherin beta 3 and Ran_GTP binding protein 5. (See
Accession Nos. gi|2102696 and gnl|PID|e328731.) The Ran_GTP binding protein is
related to importin-beta, the key mediator of nuclear localization signal (NLS)-dependent
nuclear transport. Based on homology, it is likely that this gene may have activity
similar to the RAN_GTP binding protein. Preferred polypeptide fragments comprise the
amino acid sequence: VRVAAAESMXLLLECAXVRGPEYLTQMWUFMCDALIKA
IGTEPDSDVLSE1MHSFAK (SEQ ID NO:467). Also preferred are polynucleotide
fragments encoding these polypeptide fragments.

This gene is expressed in thymus tissue. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the immune system, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment for immune disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 22 

This gene is expressed primarily in prostate and osteoclastoma tissues.
Preferred polypeptide fragments also comprise the amino acid sequence:
N1E1NNQNCFIVIDLVRTVMENGVEGELIFGAFLPESWEIGVRCSSEPPKALLLIL
AHSQKRRLDGWSFIRHLRVHYCVSLTIHFS (SEQ ID NO:468). Also preferred are 
polynucleotide sequences encoding this polypeptide fragment. 

Therefore, polynucleotides and polypeptides of the invention are useful as
[...] prostate. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the bone and prostate systems, expression of this gene at significantly higher or lower 
levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those 
comprising a sequence shown in SEQ ID NO: 255 as residues: Met-1 to Ser-11.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment for bone and prostate
disorders, especially cancers of those systems. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 23 

This gene shares sequence homology with the FK506-binding protein (FKBP-13)
family, a known cytosolic receptor for the immunosuppressants. Recently, another
group has cloned a very similar gene, recognizing the homology to FK506-binding
protein family, calling their gene FKBP23. (See Accession No. 2827255.)

This gene is expressed primarily in lymphoid tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample, especially for those susceptible to immune suppressant therapies and 
for diagnosis of diseases and conditions, which include, but are not limited to, immune 
suppressant disorders. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the immune system, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 256 as residues: Ala-19 to Val-31,
Arg-38 to Gly-49, Ala-61 to Lys-66, Tyr-68 to Pro-78, Gly-116 to Ala-121, Asp-154 to
Ser-162, Glu-173 to Gln-186, Phe-194 to Gly-203, Pro-207 to Val-212.

The tissue distribution and homology to FKBP-12 and -13 indicates that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis 
and treatment for immune suppressant disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 24 

This gene is expressed primarily in the brain and in the retina. This gene maps 
to chromosome 8, and therefore can be used in linkage analysis as a marker for 
chromosome 8. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neurological and ocular associated disease states. Similarly, polypeptides 
and antibodies directed to these polypeptides are useful in providing immunological 
probes for differential identification of the tissue(s) or cell type(s). For a number of 
disorders of the above tissues or cells, particularly of the central
nervous system, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. Preferred epitopes include those comprising a 
sequence shown in SEQ ID NO: 257 as residues: Cys-34 to Asp-40. 

The tissue distribution in retina indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the treatment and/or detection of eye disorders
including blindness, color blindness, impaired vision, short and long sightedness,
retinitis pigmentosa, retinitis proliferans, and retinoblastoma. Expression in the brain
indicates that this gene is useful for the detection/treatment of neurodegenerative disease
states and behavioral disorders such as Alzheimer's Disease, Parkinson's Disease,
Huntington's Disease, schizophrenia, mania, dementia, paranoia, obsessive compulsive
disorder and panic disorder. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 25 

This gene shows sequence homology to a newly identified class of proteins
expressed in the nervous [...]
signaling pathways. Those pathways affect cell proliferation and differentiation.
Preferred polypeptide fragments comprise the amino acid sequence:
QDKHAEEVRKNKELKEEASR (SEQ ID NO:469); QQDLSPWAAPVGCPLXXASX
ICHXLPLSGCLRRQSXSLPVVAXLCFWFSCPLASLFVPGQPCVTCPFPSLPFQD
KHAEEVRKNKELKEEASR (SEQ ID NO:470). Also preferred are the
polynucleotide fragments encoding these polypeptide fragments.

This gene is expressed primarily in brain.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurological disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the central nervous system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the detection/treatment of neurodegenerative
disease states and behavioral disorders such as Alzheimer's Disease, Parkinson's
Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia, obsessive
compulsive disorder and panic disorder.

FEATURES OF PROTEIN ENCODED BY GENE NO: 26

The polynucleotide sequence of this gene contains a domain similar to a Flt3
ligand peptide. Preferred polypeptide fragments comprise the amino acid sequence:
PTRCCTTQPCRSSARRPCWVPMVPSPEGREXQPTCPS (SEQ ID NO:471). Thus,
this gene may have activity in binding to Flt3 receptors, a process known to promote
angiogenesis and/or lymphangiogenesis.

This gene is expressed in human tonsil, and to a lesser extent in 
teratocarcinoma, placenta, colon carcinoma, and fetal kidney. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for identification of the tissue(s) or cell type(s) present in a biological sample 
and for diagnosis of diseases and conditions, which include, but are not limited to,
diseases of the tonsil, as well as cancers, such as colon, reproductive, and kidney 
cancers. Similarly, polypeptides and antibodies directed to these polypeptides are useful 
in providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the 
tonsils, colon, reproductive organs, and kidneys, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred 
epitopes include those comprising a sequence shown in SEQ ID NO: 259 as residues:
Pro-22 to Glu-33.

The tissue distribution in tonsil and several cancers and fetal tissues indicates 
that polynucleotides and polypeptides corresponding to this gene are useful for 
diagnosis and treatment of diseases of the tonsil or colon, such as tonsillitis,
inflammatory diseases involving nose and paranasal sinuses, especially during
infection with influenza, adenoviruses, parainfluenza, or rhinoviruses. The gene may also be
useful in the diagnosis and treatment of neoplasms of nasopharynx or colon origins. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 27 

A large open reading frame that encodes a preferred polypeptide exists in an
alternative reading frame. Preferred polypeptide fragments comprise the amino acid
sequence:

MKRSLNENSARSTAGCLPVPLFNQKKRNRQPLTSNPLKDDSGISTPSDNYDFP
PLPTDWAWEAVNPEXAPVMKTVDTGQIPHSVSRPLRSQDSVFNSIQSNTGRSQ
GGWSYRDGNKNTSLKTWXKNDFKPQCKRTNLVANDGKNSCPMSSGAQQQK
QLRTPEPPNLSRNKETELLRQTHSSKISGCTMRGLDKNSALQTLKPNFQQNQY
KXQMLDDIPEDNTLKETSLYQLQFKEKASSLRIISAVIESMKYWREHAQKTVLL
FEVLAVLDSAVTPGPYYSKTFLMRDGKNTLPCVFYEIDRELPRLIRGRVHRCVG
NYDQKKNIFQCVSVRPASVSEQKTFQAFVKIADVEMQYYINVMNET (SEQ ID
NO:472); SQDSVFNSIQSNTGRSQGGWSYRDGNKNTSLKTWXKNDFKPQCKR
(SEQ ID NO:473); NKETELLRQTHSSKISGCTMRGLDKNSALQTLKPNF (SEQ ID
NO:474); SSLRIISAVIESMKYWREHAQKTVLLFEVLAVLDSAVTPGPYYSKTFLM
(SEQ ID NO:475); and PRLIRGRVHRCVGNYDQKKNIFQCVSVRPASVSEQKT
FQAFV (SEQ ID NO:476).

[...] biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, male reproductive disorders, including cancer. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological 
probes for differential identification of the tissue(s) or cell type(s). For a number of 
disorders of the above tissues or cells, particularly of the male reproductive system, 
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful as a hormone with reproductive or other systemic 
functions; contraceptive development; male infertility of testicular causes, such as 
Klinefelter's syndrome, varicocele, orchitis; male sexual dysfunctions; testicular
neoplasms; and inflammatory disorders such as epididymitis.

FEATURES OF PROTEIN ENCODED BY GENE NO: 28

This gene is expressed primarily in apoptotic T-cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases relating to T cells, as well as cancer in general. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the
immune system, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for immune disorders. Moreover, since the gene 
was isolated from an apoptotic cell and based on the understanding of the relationship 
of apoptosis and cancer, it is likely that this gene may play a role in the genesis of
cancer.

FEATURES OF PROTEIN ENCODED BY GENE NO: 29

This gene is expressed primarily in human tonsils. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, gastrointestinal disorders. Similarly, polypeptides and antibodies directed 
to these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the gastrointestinal system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the diagnosis and treatment of gastrointestinal 
diseases. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 30

The translation product of this gene shares sequence homology with the C44C1.2
gene product of Caenorhabditis elegans, the function of which is unknown. Preferred polypeptide
fragments comprise the amino acid sequence: 

GVFRPCVCGRPASLTCSPLDPEVGPYCDTPTMRTLFNLLWLALACSPVHTTLSK
SDAKKAASKTLLEKSQFSDKPVQDRGLVVTDLKAESVVLEHRSYCSAKARDRH
FAGDVLGYVTPWNSHGYDVTKVFGSKFTQISPVWLQLKRRGREMFEVTGLHD
VDQGWMRAVRKHAKGLHIVPRLLFEDWTYDDFRNVLDSEDEIEELSKTVVQVA
KNOHFDGFVVEVWNQLLSQKRVGLIHMLTI IFAEAITIQARLLALLVIPPAITPGT
DQLGMFTHKEFFQLAPVLDGFSLMTYDYSTAl IQPGPNAPLSWVRACVQVLDP
KXKWRTKSSWGSTSMXWTXRXPXDARXPVVGXRXIQXLKDHXPRMVLDSK
PQ (SEQ ID NO:477); TCSPLDPEVGPYCDTPTMRTLFNLLWLALACSPVHTTLS
(SEQ ID NO:478); LVVTDLKAESVVLEHRSYCSAKARDRHFAGDVLGYVTPW
NSHGYDVTKVFGSKF (SEQ ID NO:479); REMFEVTGLHDVDQGWMRAVRK
HAKG [remaining fragment sequences illegible in source]
(SEQ ID NO:482). Also preferred are polynucleotide fragments encoding these
polypeptide fragments. This gene maps to human chromosome 11, and therefore is
useful in linkage analysis as a marker for chromosome 11.

This gene is expressed primarily in human T cells and to a lesser extent in 
human colon carcinoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, immune disorders and cancer. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the immune and gastrointestinal systems, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 263 as residues: Leu-21 to Ala-30, Ser-38 to Asp-47, Pro-87 to Asp-94, Leu-197
to Thr-204, Pro-256 to Ser-262, Thr-277 to Arg-282, Thr-293 to Trp-303. 
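
Residue ranges written in this form (three-letter residue name, position, "to", three-letter residue name, position) can be read mechanically and checked against a sequence. The sketch below is illustrative only: the function names are hypothetical, positions are assumed to be 1-based, and the sequence used in the usage note is a toy string, not SEQ ID NO: 263.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

RANGE_RE = re.compile(r"([A-Z][a-z]{2})-(\d+) to ([A-Z][a-z]{2})-(\d+)")


def parse_ranges(text):
    """Return (start_residue, start, end_residue, end) tuples found in text."""
    return [(a, int(i), b, int(j)) for a, i, b, j in RANGE_RE.findall(text)]


def extract_epitope(sequence, start_residue, start, end_residue, end):
    """Return sequence[start..end] (1-based, inclusive) after checking that
    the named residues really occur at the boundary positions."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("range lies outside the sequence")
    if sequence[start - 1] != THREE_TO_ONE[start_residue]:
        raise ValueError("start residue does not match the sequence")
    if sequence[end - 1] != THREE_TO_ONE[end_residue]:
        raise ValueError("end residue does not match the sequence")
    return sequence[start - 1:end]
```

With the toy sequence `"MKTLLA"`, `extract_epitope("MKTLLA", *parse_ranges("Lys-2 to Leu-4")[0])` yields the three residues at positions 2 through 4. The boundary check is a design choice: it catches a range that was transcribed against the wrong sequence or with a shifted position.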

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of immune
disorders and gastrointestinal diseases. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 31 

The translation product of this gene shares sequence homology with ribosomal
protein L11 of Caenorhabditis elegans. (See Accession No. 156201.) Preferred
polypeptide fragments comprise the amino acid sequence: 

ERGVSINQFCKEFNERTKDIKEGIPLPTK1LVKPDRTFEIKIGQPTVSYFLKAAAG 
IEKGARQTGKEVAGLVTLKHVYEIARIKAQDEAFALQDVPLSSVVRSIIGSARSL
GIRVVKDLSSEELAAFQKERAIFLAAQKEADLAAQEEAAKK (SEQ ID NO:483).
Also preferred are polynucleotide fragments encoding these polypeptide fragments. 

This gene is expressed in human embryo tissue and to a lesser extent in human 
epithelioid sarcoma and other tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for identification of the tissue(s) or cell type(s) present in a biological sample
and for diagnosis of diseases and conditions, which include, but are not limited to, 
development disorders and epithelial cell cancer. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of 
the above tissues or cells, particularly of the embryonic and epithelial cell systems, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 264 as residues: Lys-34 to Gly-40.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the diagnosis and treatment of developmental 
disorders and epithelial cancer. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 32

This gene is expressed primarily in resting T cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammatory and general immune disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the immune system, expression of this gene 
at significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of disorders of
the immune system.

FEATURES OF PROTEIN ENCODED BY GENE NO: 33 

This gene is believed to [text illegible in source]
brain, human umbilical vein endothelial cells, and amniotic cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, prostate-related disorders. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of 
the above tissues or cells, particularly of the urinary system and nervous system,
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution indicates that the protein products of this gene are useful 
for the diagnosis and treatment of disorders of the urinary and nervous systems. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 34 

This gene shares sequence homology with the R05G6.4 gene product. (See Accession No.
gi|1326338.) This gene also shares sequence homology with the cyclophilin-like protein
CyP-60. (See Accession No. 1199598; see also Biochem. J. 314 (1), 313-319
(1996).) Preferred polypeptide fragments comprise the amino acid sequence: 
AVYTYHEKKKDTAASGYGTQNIRLSRDAVKDFDCCCLSLQPCHDPVVTPDGYL 
YEREAILEYILHQKXEIARQMKAYEKQRGTRREEQKELQRAASQDHVRGFLEKE 
SAIVSRPLNPFTAKALSGTSPDDVQPGPSVGPPSKDKDKVLPSFWIPSLTPEAK
ATKLEKPSRTVTCPMSGKPLRMSDLTPVHFTPLDSSVDRVGLITRSERYVCAVT 
RDSLSNATPCAVLRPSGAVVTLECVEKLIRKDMVDPVTGDKLTDRDIIVLQRGT 
(SEQ ID NO:484); YLYEREAILEYILHQKKEIARQMKAYEKQRGTRREEQKELQ
RAASQDHVRGFLE (SEQ ID NO:485); and FTAKALSGTSPDDVQPGPSVGPP 
SKDKDKVLPSFWIPSLTPEAKATKLEKPSRTVTCPMSGKPL (SEQ ID NO:486). 
Also preferred are polynucleotide fragments that encode these polypeptide fragments. 
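
The preferred fragments listed above are sub-sequences of the longer sequence, except where the longer sequence carries an X (an undetermined residue). A minimal sketch of locating a fragment within a longer sequence while treating X as matching any residue is given below. The function name and the wildcard handling are assumptions for illustration; the two strings in the usage note are short excerpts of the sequences as printed in this text.

```python
def find_fragment(parent, fragment, wildcard="X"):
    """Return the 1-based position where fragment aligns with parent,
    treating the wildcard character in either sequence as a match,
    or None if there is no such position."""
    n, m = len(parent), len(fragment)
    if m == 0 or m > n:
        return None
    for start in range(n - m + 1):
        window = parent[start:start + m]
        if all(p == f or wildcard in (p, f) for p, f in zip(window, fragment)):
            return start + 1
    return None
```

Usage with excerpts from the text: `find_fragment("PDGYLYEREAILEYILHQKXEIARQMKAYEKQRG", "YLYEREAILEYILHQKKEIARQMKA")` locates the fragment even though the longer excerpt has X where the fragment has K.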

This gene is expressed primarily in human testis and to a lesser extent in other
tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, male reproductive disorders and in particular testicular cancer. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of disorders of the
male reproductive system and in particular of testicular cancer.

FEATURES OF PROTEIN ENCODED BY GENE NO: 35 

The translation product of this gene shares sequence homology with Lpe5p of 
Saccharomyces cerevisiae, which is thought to be important in the metabolism of
phospholipids. 

This gene is expressed primarily in liver and brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, metabolic disorders. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the metabolic and nervous systems, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, 
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual 
having such a disorder, relative to the standard gene expression level, i.e., the 
expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 268 as residues: Pro-14 to Leu-20, Lys-28 to Asn-38, Arg-109 to Arg-114,
Lys-119 to Asn-124, Glu-152 to Leu-157, Pro-172 to Val-180.

The tissue distribution and homology to Lpe5p of Saccharomyces cerevisiae 
indicates that polynucleotides [remainder of sentence illegible in source]

FEATURES OF PROTEIN ENCODED BY GENE NO: 36

This gene shares sequence homology with the nuclear ribonucleoprotein U (hnRNP
U) encoded by C. elegans. (See Accession No. gi|1703576.) Preferred polypeptide
fragments comprise the amino acid sequence: 

MDTSENRPENDVPEPPMPIADQVSNDDRPEGSVEDEEKKESSLPKSFKRKISVV
SATKGVPAGNSDTEGGQPGRKRRWGASTATTQKKPSISITTESLKSLIPDIKPL
AGQEAVVDLHADDSR1SEDETERNGDDGTHDKGEKICRTVTQVVPAEGQENGQ
REEEEEEKEPEAEPPVPPQVSVEVALPPPAEHEVKKVTLGDTLTRRSISQQKSGV
SITIDDPVRTAQVPSPPRGKISNIVHISNLVRPFTLGQLKELLGRTGTLVEEAFWI
DKJKSHCFVTYSTVEEAVATRTALHGVKWPQSNPKFLCADYAEQDELDYHRGL
LVDRPSETKTEEQGIPRPLHPPPPPPVQPPQHPRAEQREQERAVREQWAERERE
MERRERTRSEREWDRDKVREGPRSRSRSRXRRRKERAKSKEKKSEKKEKAQE
EPPAKLLDDLFRKTKAAPCIYWLPLTDSQIVQKEAERAERAKEREKRRKEQEEE
EQKEREKEAERERNRQLEREKRREHSRERDRERERERERDRGDRDRDRERDRE
RGRERDRRDTKRHSRSRSRSTPVRDRGGR (SEQ ID NO:488). Also preferred are
the polynucleotide fragments encoding this polypeptide fragment.

This gene is expressed primarily in epididymis.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases of the male reproductive system. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the male reproductive system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the diagnosis and treatment of male 
reproductive disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 37

This gene is expressed primarily in amygdala. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, inflammatory diseases and reproductive disorders. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the amygdala, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of inflammatory
diseases and reproductive disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 38 

This gene shares sequence homology with human opsonin protein P35 
fragment. (See Accession No. R94181.) The opsonin protein activates the phagocytosis 
of pathogenic microbes by phagocytic cells. Preferred polypeptide fragments comprise 
the amino acid sequence: GCDSCPPHLPREAFAQDTQAEGECSSRAERADMCPDAP 
PSQEVPEGPGAAP (SEQ ID NO:489). Also preferred are polynucleotide fragments 
encoding these polypeptide fragments. 

This gene is expressed in immune-related tissues such as thymus, macrophage, 
T cells and to a lesser extent in many other tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, immune disorders and infectious disease. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the immune system and infectious disease, 
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 271 as residues: Lys-9 to Arg-14, Met-38 to Asp-51.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of immune disorders, 
as well as the treatment and/or diagnosis of infectious disease. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 39 

The translation product of this gene shares sequence homology with alpha-2
type I collagen, which is thought to be important in tissue repair. (See, e.g., 211607.)
Preferred polypeptide fragments comprise the amino acid sequence: PQLPSCGRPW 
PGTASVFQSHTQGPREDPDPCRAQGSAGTHCPISLSPPRQ (SEQ ID NO:490). 
Also preferred are the polynucleotide sequences encoding these polypeptide sequences. 

This gene is expressed primarily in the brain and to a lesser extent in the kidney
and thymus.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, brain, kidney, and immune disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the brain, kidney, and immune disorders, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution and homology to alpha-2 type I collagen indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis
and treatment of conditions involving tissue repair, and of brain, kidney, and immune disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 40 

The translation product of this gene shares sequence homology with
mini-collagen, which is thought to be important in tissue repair and tumor metastasis. (See
Accession No. gnl|PID|d1006976.) Preferred polypeptide fragments comprise the
amino acid sequence: PGFRGPSGSLGCSFFPRSLGRVLPPGCQRPGAHAD 
SSPPPTP (SEQ ID NO:491). Also preferred are polynucleotides encoding this
polypeptide fragment. 

This gene is expressed in ovarian cancer and to a lesser extent in dendritic cells
and smooth muscle. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, tumor metastasis and tissue repair. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the tumor metastasis and tissue repair, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 273 as residues: Asn-2 to His-11.

The tissue distribution and homology to the mini-collagen gene indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis 
and treatment of tumor metastasis and tissue repair. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 41 

This gene shares sequence homology with the HIV TAT protein. (See 
Accession No. 328416.) Preferred polypeptide fragments comprise the amino acid 
sequence: EDLKKPDPASLRAASCGEGKKRKACKNCTCGLAEELEKEK 
SREQMSSQPKSACGNCYLGDAFRCASCPYLGMPAFKPGEKVLLS (SEQ ID 
NO:492); EDLKKPDPASLRAASCGEGKKRKACKNCTCGLAEELEKEK
SREQMSSQPKSACGNCYLGDAFRCASCPYLGMPAFKPGEKVLLSDSNLHD 
(SEQ ID NO:493); CGNCYLGDAFRCASCPYLGMPAFKPGEKVLLSDS 
(SEQ ID NO:494); SCGEGKKRKACKNCTCGLAEELEKE (SEQ ID NO:495); 
SQPKSACGNCYLGDAFRCASC (SEQ ID NO:496); and REAGQNSERQYVS
LSRD (SEQ ID NO:497). Also preferred are polynucleotide fragments encoding these
polypeptide fragments.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, brain, testes and breast disorders. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the brain, testes and breast disorders, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 274 as residues: Pro-7 to Val-15.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of brain, testes and
breast, and other related disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 42 

This gene is expressed primarily in the infant brain, human cerebellum, and to a
lesser extent in medulloblastoma.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, brain related disorders and medulloblastoma and other brain cancers.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
brain related disorders and brain cancers, including medulloblastoma, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, 
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual 
having such a disorder, relative to the standard gene expression level, i.e., the 
expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 275 as residues: Thr-41 to Glu-47. 

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of human brain related 
disorders, brain cancers, and medulloblastoma. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 43 

The translation product of this gene shares sequence homology with a
phosphotyrosine-independent ligand for the lck SH2 domain, which is thought to be
important in signal transduction. (See Accession No. gi|1184951.) Preferred polypeptide fragments
comprise the amino acid sequence: ESSGQARTLADPGPGWPRQQGMCFGSLT
GLSTTPHGFLTVSAEADPRLIESLSQMLSMGFSDEGGWLTRLLQTKNYDIGAAL
DTIQYSKH (SEQ ID NO:498). Also preferred are polynucleotide fragments encoding
this polypeptide fragment. It is likely that this gene is a new member of a family of
phosphotyrosine-independent ligands for the lck SH2 domain.

This gene is expressed primarily in the placenta and to a lesser extent in 
endothelial cells and neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, reproductive, cardiovascular, immune, and infectious diseases. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
cardiovascular, reproductive, and immune system, and infectious diseases, expression 
of this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution and homology to a phosphotyrosine-independent ligand 
for the lck SH2 domain indicates that polynucleotides and polypeptides corresponding 
to this gene are useful for diagnosis and treatment of cardiovascular, reproductive, and 
immune system diseases, as well as infectious diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 44

This gene is expressed primarily in the fetal brain, cerebellum and to a lesser 
extent in the placenta. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neuronal cell related disorders. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of 
the above tissues or cells, particularly of the neuronal cell related disorders, expression 
of this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 277 as residues: Thr-20 to Gly-28. 

The tissue distribution and homology to proline-rich protein genes indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis 
and treatment of neuronal cell related disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 45 

The translation product of this gene shares sequence homology with 
precerebellin of human, which is thought to be important in synaptic physiology. (See 
Accession No. gi|180251.) It has been observed that cerebellin-like immunoreactivity is
associated with Purkinje cell postsynaptic structures. Thus, it is likely that the product of this gene
also has synaptic activity. Preferred polypeptide fragments comprise the amino acid
sequence: QEGSEPVLLEGECLVVCEPGRAAAGGPGGAALGEAPPGRVAFXAV 
RSHHHEPAGETGNGTSGAIYFDQVLVNEGGGFDRASGSFVAPVRGVYSFRFH 
VVKVYNRQTVQVSLMLNTWPVISAFANDPDVTREAATSSVLLPLDPGDRVSLR 
LRRGXSTGW (SEQ ID NO:499). Also preferred are polynucleotide fragments 
encoding these polypeptide fragments. 

This gene is expressed primarily in cerebellum and infant brain. By Northern 
analysis, a single transcript of 2.4 kb was observed in brain tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, disorders of neuronal cell signal transduction and synaptic physiology. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the neuronal cell 
signal transduction and synaptic physiology, expression of this gene at significantly
higher or lower levels may be routinely detected in certain tissues (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal 
fluid) or another tissue or cell sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy tissue 
or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to precerebellin indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis 
and treatment of neuronal cell related disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 46 

This gene is expressed in fetal liver and spleen, and to a lesser extent in bone 
marrow, umbilical vein, and T cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, disorders of the immune system, particularly hematopoiesis. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly hematopoietic 
and immune disorders, expression of this gene at significantly higher or lower levels 
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or 
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another 
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those 
comprising a sequence shown in SEQ ID NO: 279 as residues: Asp-30 to Glu-57. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of hematopoietic and 
immune disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 47 

The translation product of this gene shares sequence homology with a 12 kD 
nucleic acid binding protein of feline calicivirus, which is thought to be important in viral 
replication. (See Accession No. 59264.) 

This gene is expressed primarily in human cardiomyopathy and to a lesser 
extent in T helper cells, fetal brain and synovial sarcoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, cardiomyopathy as well as viral infection. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the cardiovascular system, expression of 
this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 280 as residues: Trp-20 to Cys-26. 
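
Epitope ranges in this specification are written as a three-letter residue and position pair, for example "Trp-20 to Cys-26". The following sketch shows one way such a range might be parsed and used to slice a polypeptide sequence with 1-based numbering. The function names and the toy sequence are hypothetical; the toy sequence is not SEQ ID NO: 280, which is not reproduced in this section.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

RANGE_PATTERN = re.compile(
    r"\s*([A-Z][a-z]{2})-(\d+)\s+to\s+([A-Z][a-z]{2})-(\d+)\s*"
)


def parse_epitope_range(text):
    """Parse a range such as 'Trp-20 to Cys-26' into (first, start, last, end)."""
    match = RANGE_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognized epitope range: {text!r}")
    first, start, last, end = match.groups()
    return first, int(start), last, int(end)


def extract_epitope(sequence, text):
    """Return the fragment named by the range, checking both boundary residues."""
    first, start, last, end = parse_epitope_range(text)
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("range falls outside the sequence")
    fragment = sequence[start - 1:end]  # convert 1-based inclusive to a slice
    if (fragment[0] != THREE_TO_ONE.get(first)
            or fragment[-1] != THREE_TO_ONE.get(last)):
        raise ValueError("boundary residues do not match the sequence")
    return fragment


# Toy sequence for illustration only.
toy_sequence = "MKTAWLLCGS"
print(extract_epitope(toy_sequence, "Trp-5 to Cys-8"))
```

Checking the boundary residues against the sequence guards against off-by-one numbering or a range applied to the wrong SEQ ID NO.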

The tissue distribution in cardiomyopathy and homology to viral 12 kD nucleic 
acid binding protein indicates that polynucleotides and polypeptides corresponding to 
this gene are useful for diagnosis and intervention of cardiomyopathy, including those 
caused by ischemic, hypertensive, congenital, valvular, or pericardial abnormalities. 
The gene expression pattern may be the consequence or the cause of these conditions. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 48 

The translation product of this gene shares sequence homology with tumor 
necrosis factor related gene product which is thought to be important in tumor necrosis, 
bacterial and viral infection, immune diseases and immunoreactions. 

This gene is expressed primarily in colon and to a lesser extent in ovarian and 
breast cancers. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, tumors of colon, ovary or breast origins. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the colon, ovary and breast, expression of 
this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution and homology to tumor necrosis factors indicate that 
polynucleotides and polypeptides corresponding to this gene are useful for intervention 
in cancers of colon, ovary and breast origins, because TNF family members are known 
to be involved in tumor development. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 49 

The translation product of this gene shares sequence homology with mucins, 
such as epithelial mucin, which is thought to be important in extracellular matrix 
functions such as protection, lubrication and cell adhesion (See for example Accession 
No. R68002). Preferred polypeptide fragments comprise the following amino acid 
sequence: PRSRPALRPGRQRPPSHSATSGVLRPRKKPDP (SEQ ID NO:500). 
Also preferred are polynucleotide fragments encoding these polypeptide fragments. 
Moreover, this gene maps to chromosome 22q11.2-qter, and therefore, can be used as 
a marker in linkage analysis for chromosome 22. 

This gene is expressed primarily in corpus callosum. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, tumors, especially of corpus callosum, as well as metastatic lesions. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
corpus callosum and other solid tissues, expression of this gene at significantly higher 
or lower levels may be routinely detected in certain tissues (e.g., cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal 
fluid) or another tissue or cell sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy tissue 
or bodily fluid from an individual not having the disorder. 

The tissue distribution and homology to mucins indicate that polynucleotides 
and polypeptides corresponding to this gene are useful as serum tumor markers or 
immunotherapy targets because tumor cells have greatly elevated levels of mucin 
expression and shed the molecules into the epithelial tissues. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 50 

This gene is expressed primarily in CD34 depleted buffy coat cord blood and 
primary dendritic cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, hematopoietic disorders and immunological disorders. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the hematopoietic and 
immune systems, expression of this gene at significantly higher or lower levels may be 
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

The tissue distribution in CD34 depleted buffy coat cord blood and primary 
dendritic cells indicates that polynucleotides and polypeptides corresponding to this 
gene are useful for diagnosis and treatment of hematopoietic and immune disorders. 
Secreted or cell surface proteins with the above tissue distribution are often involved in 
cell activation (e.g., cytokines) or are molecules involved in cell surface activation. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 51 

The translation product of this gene shares sequence homology with the polypeptide 
encoded by the interferon-induced 1-8 gene, which is thought to be important in binding to 
the retroviral rev responsive element. Preferred polypeptide fragments comprise the 
following amino acid sequence: MTLITPSXKLTFXKGNKSWSSRACSSTLVDP 
(SEQ ID NO:501). Also preferred are polynucleotide fragments encoding these 
polypeptide fragments. 

This gene is expressed primarily in CD34 positive cells and neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, retroviral infection, such as AIDS, and other immune disorders. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
immune system, expression of this gene at significantly higher or lower levels may be 
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. Preferred epitopes include those comprising a 
sequence shown in SEQ ID NO: 284 as residues: Gln-51 to Trp-62. 

The tissue distribution and homology to interferon induced gene 1-8 indicates 
that polynucleotides and polypeptides corresponding to this gene are useful for 
intervention of retroviral infection including HIV. The factor may be involved in viral 
stability or viral entry into the cells. Alternatively, the virus/factor complex may elicit 
the cellular immune reaction. 



FEATURES OF PROTEIN ENCODED BY GENE NO: 52 

This gene shares sequence homology to immunoglobulin lambda chain (See 
Accession No. 2865484). Therefore it is likely that this gene has activity similar to an 
immunoglobulin lambda chain. Preferred polypeptide fragments comprise the following 
amino acid sequence: GHPSPALSIAPSDGSQLPCDEVPYGEAHVTRYCKKPLTNS 
HLETEAQSSSL (SEQ ID NO:502). Also preferred are polynucleotide fragments 
encoding these polypeptide fragments. 

This gene is expressed primarily in Hodgkin's lymphoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, Hodgkin's lymphoma and other immune disorders. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the immune system, 
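expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 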
NO: 285 as residues: Pro-27 to Thr-32. 

The tissue distribution in Hodgkin's lymphoma and the sequence homology 
indicates that polynucleotides and polypeptides corresponding to this gene are useful for 
diagnosis of Hodgkin's lymphoma, since the elevated expression and secretion by the 
tumor mass may be indicative of tumors of this type. Additionally the gene product may 
be used as a target in the immunotherapy of the cancer. Because the gene is expressed in 
cells of lymphoid origin, the natural gene product may be involved in immune 
functions. Therefore it may also be used as an agent for immunological disorders 
including arthritis, asthma, immune deficiency diseases such as AIDS, and leukemia. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 53 

This gene has extensive homology to the cDNA for the Homo sapiens mRNA for the 
ISLR gene (See Accession No. AB003184). This protein is considered to be a new 
member of the Ig superfamily and contains a leucine-rich repeat (LRR) with conserved 
flanking sequences and a C2-type immunoglobulin (Ig)-like domain. These domains are 
important for protein-protein interaction or cell adhesion, and therefore it is possible that 
the novel protein ISLR may also interact with other proteins or cells. The ISLR gene 
was mapped on human chromosome 15q23-q24 by fluorescence in situ hybridization 
(See Medline Article No. 97468140). Homology to the ISLR gene has been confirmed 
by another independent group as well (See Accession No. Hs.102171). 

This gene is expressed in a number of tissues including human retina, heart, 
skeletal muscle, prostate, ovary, small intestine, thyroid, adrenal cortex, testis, 
stomach, spinal cord, fetal lung and fetal kidney tissues, colon, tonsil and stomach 
cancer, and to a lesser extent in endometrial stromal cells treated with estradiol, breast 
tissue, synovium, lymphoma, and a number of other tumors. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, tumors of colon, ovary and breast origins. However, due to the wide 
range of expression in various tissues, the protein may play a vital role in the development 
of cancer in other tissues as well, not just those mentioned above. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the colon, ovary and 
breast, expression of this gene at significantly higher or lower levels may be routinely 
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., 
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample 
taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. Additionally, this gene maps to chromosome 15q23- 
q24, and therefore, can be used as a marker in linkage analysis for chromosome 15. 

The tissue distribution in tumors of colon, ovary, and breast origins indicates 
that polynucleotides and polypeptides corresponding to this gene are useful for 
diagnosis and intervention of these tumors, in addition to other tumors where 
expression has been indicated. The protein, as well as antibodies directed against the 
protein, may show utility as tumor markers and/or immunotherapy targets for the 
above-listed tumors and tissues. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 54 

This gene has homology to multidrug resistance gene 1 (See Accession No. 
P06795). Preferred polynucleotide fragments comprise the following sequence: 

GCTTCGTGTCCAACCCTCTTGCCCTTCGCCTGTGTGCCTGGAGCCAGTCCCA 
CCACGCTCGCGTTTCCTCCTGTAGTGCTCACAGGTCCCAGCACCGATGGCA 
TTCCCTTTGCCCTGAGTCTGCAGCGGGTCCCTTTTGTGCTTCCTTCCCCTCA 
GGTAGCCTCTCTCCCCCTGGGCCACTCCCGGGGGTGAGGGGGTTACCCCTT 

CCCAGTGTiTTTrATTCCrGTGGGGCTCACCCCAAAGTATTAAAAGTAGCTTT 
GTAA (SEQ ID NO:503). Also preferred are polypeptide fragments encoded by these 
polynucleotide fragments. 

This gene is expressed primarily in lung, esophagus, leukemia (Jurkat cells) and 
breast cancers and to a lesser extent in macrophages treated with GM-CSF, fetal tissues, 
and a wide range of other tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, cancers of a wide range of origins. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of 
the above tissues or cells, particularly of the solid tumors, lung and leukemia, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Furthermore, due to the high expression level in lung tissue and the proposed 
function of the multidrug resistance protein 1 gene as the efflux pump responsible for 
low drug accumulation in multidrug-resistant cells, the protein, as well as mutants thereof, 
may also be beneficial as a target for gene therapy, particularly for the chronic patient. 
Preferred epitopes include those comprising a sequence shown in SEQ ID NO: 287 as 
residues: Met-1 to Lys-16. 

The tissue distribution in a wide range of cancers and fetal tissues indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for detection of 
cells in active proliferation, such as cancers. The gene products may be used as cancer 
markers or immunotherapy targets. 



FEATURES OF PROTEIN ENCODED BY GENE NO: 55 

This gene maps to the X chromosome. 

This gene is expressed primarily in the brain and to a lesser extent in the 
developing embryo. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neurodegenerative disease states and developmental disorders. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders, including sex-linked disorders, of the above tissues or cells, 
particularly of the neurological, developmental, and cardiovascular systems, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Moreover, this gene maps to the X chromosome, and therefore, may be used 
as a marker in linkage analysis for this chromosome. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the detection/treatment of neurodegenerative 
disease states and behavioral disorders such as Alzheimer's Disease, Parkinson's 
Disease, Huntington's Disease, Klinefelter's syndrome, schizophrenia, mania, dementia, 
paranoia, obsessive compulsive disorder and panic disorder. In addition, the gene or 
gene product may also play a role in the treatment and/or detection of developmental 
disorders associated with the developing embryo, sex-linked disorders, or 
disorders of the cardiovascular system. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 56 

The translation product of this gene shares sequence homology with paxillin, 
which is thought to be important in mediating signal transduction from growth factor 
receptors to the cytoskeleton. Preferred polynucleotide fragments comprise the 
following sequence: TGGCTCACrGTCTrACAATCACTGCTTGTGCjAATCATGA 
TACCACTITTAGCrCTITGC^^ 

AAGTAGATTTrAACTGGACAACTTTG 

GGCTTGTGGTTTCAA (SEQ ID NO:506). Also preferred are polypeptide fragments 
encoded by these polynucleotide fragments. More preferably, polypeptide fragments 
comprise the amino acid sequence: LDELMAi ILTEMQAK V AVRAD 
AGKKilLPDKQDHKASLDSMLGGLEQELQDLGIATVPKGIICASCQKPIAGKVI 

HALGQSWHPEUFVCTHCKEEIGSSPFFERSGLXYCPNDYHQLFSPRCAYCAAP 
ILDKVLTAMNQTWHPEHFFCSHCGEVFGAEGFHEKDKKPYCRKDFLAMFSPK 
CGGCNRPVLENYESAMDTVWHPECFVCGDCFTSFSTGSFFELDGRPFCELHYH 
HRRGTLCHGCGQPITGRCISAMGYKFHPEHFVCAFCLTQLSKGIFREQNDKTY 
CQPCFNKLF (SEQ ID NO:507); KASLDSMLGGLEQELQDLGIATVPKGHC 

ASCQKPI AGKVIH AL (SEQ ID NO:508); CPNDYI IQLFSPRCA YCAAPILDKVL 
TAMNQTWHPEUFFCSHCGEVFGAEG (SEQ ID NO:509); DKKPYCRKDFLAM 

FSPKCGGCNRPVLENYLSAMDTVWHPECFVCGDCFTSFSTGSFFELDGRPFCE 

E (SEQ ID NO:510); CGQPITGRCISAMGYKFHPEHFVCAFCLTQLSKGIFRE 
QNDKTYCQ (SEQ ID NO:511). Polynucleotide fragments encoding these preferred 
polypeptide fragments are also contemplated. 

This gene is expressed primarily in brain, and to a lesser extent in the 
developing embryo. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neurological disease states and developmental abnormalities. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. Moreover, since this gene shares homology with a 
gene that maps to chromosome 11 (See Accession No. T87404), the gene as well as its 
translated product may be used for linkage analysis on chromosome 11. 

The tissue distribution and homology to paxillin indicate that polynucleotides 
and polypeptides corresponding to this gene are useful for the treatment and/or detection 
of disease states associated with abnormal signal transduction in brain and/or the 
developing embryo. This would include treatment or detection of neurodegenerative 
disease states and behavioral disorders such as Alzheimer's Disease, Parkinson's 
Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia, obsessive 
compulsive disorder and panic disorder, and also the treatment and/or detection of 
embryonic development defects. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 57 

This gene is expressed primarily in fetal spleen, brain, and to a lesser extent in 
six week old embryo. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, immune disorders, neurological disorders, and developmental 
abnormalities. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the tissue(s) 
or cell type(s). For a number of disorders of the above tissues or cells, particularly of 
the immune and developmental systems, expression of this gene at significantly higher 
or lower levels may be routinely detected in certain tissues (e.g., cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal 
fluid) or another tissue or cell sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy tissue 
or bodily fluid from an individual not having the disorder. Preferred epitopes include 
those comprising a sequence shown in SEQ ID NO: 290 as residues: Arg-28 to Gly-34. 

The expression of this gene in fetal spleen indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for treatment/detection of immune 
disorders such as arthritis, asthma, immune deficiency diseases such as AIDS, and 
leukemia. In addition, the expression of this gene in the early embryo indicates a key 
role in embryo development, and hence the gene or gene product could be used in the 
treatment and/or detection of embryonic development defects. This would also include 
treatment or detection of neurodegenerative disease states and behavioral disorders such 
as Alzheimer's Disease, Parkinson's Disease, Huntington's Disease, schizophrenia, 
mania, dementia, paranoia, obsessive compulsive disorder and panic disorder. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 58 

The translation product of this gene shares sequence homology with the gene disrupted 
in the neurodegenerative disease dentatorubral-pallidoluysian atrophy. Moreover, a long 
open reading frame exists in an alternative frame. Preferred polypeptide fragments 
comprise the following: 

MGSSQSVEIPGGCjTRGYHVLRVQENSPGHKAGLEPFFDFIVSINGSRLNKDND 

TLKDLLKXNVEKPVKMLIYSSKTLELRETSV-ITSNLWGGQGLLGVSIRFCSFD 
GANFNVWHVT FVP^M^pa a i a^_f r> 

- - . — . , - ^vj^ivrnou i hkj. kl j i v ivijn tlJLh SLiE T HE AKP 

LKLYVYNTDTDNCREVIITPNSAWGGEGSLGCGIG YGYLHRIPTRPFEEGKKIS 
LPGQMAGTPITPLKDGFTEVQLSSVNPPSLSPPGTTGIEQSLTGLSISSTPPAVSS 
VLSTGVPTVPLLPPQVNQSLTSVPPMNPATTLPGLMPLPAGLPNLPNLNLNLPA 
PHIMPGVGLPELVNPGLPPLPSMPPRNLPGIAPLPLPSEFLPSFPLVPESSSAASS 
GELLSSLPPTSNAPSDPATITAKADAASSLTVDVTPITAKAPTTVEDRVGDSTPV 
SEKPVSAAVDANASESP (SEQ ID NO:512); SVE1PGGGTEGYHVLRVQENSPGH 

RAGLEPFFDFIVSINGSRLNKDNDTLKDLLKXNVEKPVKMLIYSSKTLELRETS 
VTPSNLWGGQGLLGVSIRFCSFDGANENVWH (SEQ ID NO:513); ESNSPAA 

LAGLRPHSDYIIGADTVMNESEDLFSLIETHEAKPLKLYVYNTDTDNCREVIITP 
NSAWGGEGSLGCGIGYGYLHRIPTRPFEEGKKJSLPGQMAGTPITPLKDGFTEV 
QLSS VN PPSLS PPGTTGIEQS LTG LSISS (SEQ ID NO:514); RIPTRPFEEGKKI 

SLPGQMAGTPITPLKDGFTEVQLSSVNPPSLSPPGTTGIEQSLTGLSISSTPPAVS 
SVLSTGVPTVPLLPPQVNQSLTSVPPMNPATTLPGLMPLPAGLPNLPNLNLNIP 
APHIMPGVGLPELVNPGLPPLPSMPPRN (SEQ ID NO:516); PGLPPLPSMPPRN 

LPGIAPLPLPSEFI.PSFPLVPESSSAASSGELLSSLPPTSNAPSDPATTTAKADAA 
SSL'I \'DV7 PPTAKv\PTTVEDRVGDSTPVSEKPVSAAVDAN (SEQ ID NO:517). 

This gene is expressed primarily in prostate cancer, and to a lesser extent in the 
pineal glands and in fetal lung. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the nervous, 
pulmonary, and endocrine systems, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those 
comprising a sequence shown in SEQ ID NO: 291 as residues: Asn-9 to Leu-14. 

The abundance of this gene in the pineal gland and its homology to a gene 
disrupted in the neurodegenerative disease state dentatorubral-pallidoluysian atrophy 
indicate that this gene may be useful in the treatment and/or detection of other 
neurodegenerative disease states and behavioral disorders such as Alzheimer's Disease, 
Parkinson's Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia, 
obsessive compulsive disorder and panic disorder. The abundance of this gene in fetal 
lung would suggest that misregulation of the expression of this protein product in the 
adult could lead to lymphoma or sarcoma formation, particularly in the lung; that it may 
also be involved in predisposition to certain pulmonary defects such as pulmonary 
edema and embolism, bronchitis and cystic fibrosis; and thus that the gene or the 
protein encoded by the gene could be used in the detection and/or treatment of these 
pulmonary disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 59 

This gene is expressed primarily in the developing embryo. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, developmental abnormalities. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of 
the above tissues or cells, particularly of the developmental system, expression of this 
gene at significantly higher or lower levels may be routinely detected in certain tissues 
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, 
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual 
having such a disorder, relative to the standard gene expression level, i.e., the 
expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The expression of this gene primarily in the embryo indicates that the gene plays a 
key role in embryo development and that the gene or the protein encoded by the gene 
could be used in the treatment and/or detection of developmental defects in the embryo 
or in infants. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 60 

This gene displays homology to nestin, an intermediate filament protein, the 
expression of which correlates with the proliferation of Central Nervous System 
progenitor cells and that is useful in the identification of brain tumors. This gene maps 
to chromosome 1, and therefore, may be used as a marker in linkage analysis for 
chromosome 1 (See Accession No. AA527348). 

This gene is expressed primarily in kidney and to a lesser extent in brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, renal disorders and neurodegenerative conditions. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the excretory and 
20 nervous systems, expression of this gene at significantly higher or lower levels may be 
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. Preferred epitopes include those comprising a 
sequence shown in SEQ ID NO: 293 as residues: Thr-128 to Asn-135. 

The tissue distribution and homology to nestin indicate that polynucleotides
and polypeptides corresponding to this gene are useful for detection and/or treatment of
neurodegenerative disease states and behavioral disorders such as Alzheimer's Disease,
Parkinson's Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia,
obsessive compulsive disorder and panic disorder. In addition, its abundance in kidney 
indicates that it is useful in the treatment and detection of acute renal failure and other 
disease states associated with the kidney. 



comprise the following amino acid sequence:

IYKVFRHTAGLKPEVSCFENIRSCARXXXXXXXXXXXXWIFGVLHVVHASVV 
TAYLFTVSNAFQGMFIFLFLCVLSRKIQEEYYRLFKNVPCC (SEQ ID NO:518); 

WIFGVLHVVHASVVTAYLFTVSNAFQGMFIFEFLCVLSRKIQEEYYRLFKNVPCC
(SEQ ID NO:519). Also preferred are polynucleotide fragments encoding these
polypeptide fragments (See Accession No. 2213659). The translation product of this
gene shares sequence homology with CD97, a seven transmembrane bound receptor.

This gene is expressed primarily in infant brain and in endothelial cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurological disorders and hematopoietic disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the neurological and
hematopoietic systems, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or 
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another 
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those 
comprising a sequence shown in SEQ ID NO: 294 as residues: Lys-13 to Leu-21. 

The tissue distribution of this gene suggests that it may be useful in the detection
and/or treatment of neurodegenerative disease states and behavioral disorders such as
Alzheimer's Disease, Parkinson's Disease, Huntington's Disease, schizophrenia, mania,
dementia, paranoia, obsessive compulsive disorder and panic disorder, while its 
expression in hematopoietic cell types indicates that the gene could be important for the 
treatment or detection of immune or hematopoietic disorders including arthritis, asthma 
and immunodeficiency diseases. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 62

This gene is expressed primarily in fetal liver and fetal spleen. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, hematological and immunological disorders. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune and hematopoietic systems,
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 295 as residues: Ser-91 to Lys-98. 

The tissue distribution of this gene in fetal liver and spleen indicates that the gene
could be important for the treatment or detection of immune or hematopoietic disorders 
including arthritis, leukemia, asthma and immunodeficiency diseases. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 63 

This gene shares homology with human serum amyloid protein. Preferred polypeptide
fragments comprise the following amino acid sequence:

ALTRIPPGDWVINVTAVSFAGKTTARFFHSSPPSLGDQARTDPGHQRRD (SEQ
ID NO:520) (See Accession No. W13671). Also preferred are polynucleotide
fragments encoding these polypeptide fragments. This gene maps to chromosome 9, and
therefore, may be used as a marker in linkage analysis for chromosome 9 (See
Accession No. AA004342).

This gene is expressed primarily in fetal liver and spleen. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, hematopoietic and immune disorders. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the hematopoietic and immune systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution of this gene in fetal liver and spleen indicates that the gene
could be important for the treatment or detection of immune or hematopoietic disorders 
including arthritis, leukemia, asthma, and immunodeficiency diseases. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 64 

This gene maps to chromosome 3, and therefore, may be used as a marker in 
linkage analysis for chromosome 3 (See Accession No. AA219669).

This gene is expressed specifically in the brain.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neurodegenerative disease states. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the neurological systems, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual 
having such a disorder, relative to the standard gene expression level, i.e., the 
expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the detection/treatment of neurodegenerative 
disease states and behavioral disorders such as Alzheimer's Disease, Parkinson's
Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia, obsessive
compulsive disorder and panic disorder. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 65 

This gene shares homology with a yeast protein. Preferred polypeptide fragments
comprise the following amino acid sequence: LQEVNITLPENSVWYERYKFDIP
VFHL (SEQ ID NO:521). Also preferred are polynucleotide fragments encoding these
polypeptide fragments (See Accession No. 1332638).

This gene is expressed primarily in fetal tissue (fetus and fetal liver). 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, liver disorders and cancers (e.g. hepatoblastoma). Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the hepatic system, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 298 as residues: Asn-59 to Glu-64. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the detection and treatment of liver disorders 
and cancers (e.g. hepatoblastoma, jaundice, hepatitis, liver metabolic diseases and
conditions that are attributable to the differentiation of hepatocyte progenitor cells). In
addition, the expression in fetus would suggest a useful role for the protein product in
developmental abnormalities, fetal deficiencies, pre-natal disorders and various
wound-healing models and/or tissue trauma.



FEATURES OF PROTEIN ENCODED BY GENE NO: 66 

This gene has homology with a B-cell surface antigen, which may indicate that the gene
plays a role in the immune response, including, but not limited to, disorders and
infections of the immune system. Preferred polynucleotide fragments comprise the
following sequence: TAGCATGTAGCCAGTCGAATAACNTATAAGGACAAAGTGGAGTC
CACGCGTGCGGCCGTCTAGACTAGTGGATCCCCCGGCTGCAGGATTCGGC
ACGAG (SEQ ID NO:523). Also preferred are polypeptide fragments encoded by
these polynucleotide fragments (See Accession No. T94535). Additionally, this gene
shares homology with an interferon-gamma receptor. Preferred polypeptide fragments
also comprise the following amino acid sequence: MQGSGSQPRACFLCFCFSCPC
SPGGPRWNSRQGGRRFPKTCRAISQNLVFKYKTFCPVRYMQPHRSSLCLHFTS
YVFILSTVVGSLRTYSTDLKKKKKNSRGGPVPIRPKS (SEQ ID NO:522);
MQGSGSQFRACLLCLCFSCPCSPGGPRWNSRQGGRRFPKTCRAISQNLVFK
(SEQ ID NO:524); PVRYMQPHRSSLCLHFTSYVFILS TWGSLR TYSTDLKKKKK
NSRGGPVPIRPKS (SEQ ID NO:525); and GHFQRDCSLGWRGVGMRATIICQAA
RMFVFFSI PKYAGF (SEQ ID NO: [...]).

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunological disorders and conditions (immunodeficiencies, cancer,
leukemia, hematopoiesis). Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune and digestive systems, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 299 as residues:
Thr-41 to Gly-52.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the treatment and diagnosis of immune 
disorders including: leukemias, lymphomas, auto-immune disorders, immunosuppressive
conditions (transplantation) and immunodeficiencies (e.g. AIDS), inflammation and
hematopoietic disorders. The expression of this gene in gall bladder would suggest a
possible role for this gene product in digestive disorders, particularly of the pancreas. 



FEATURES OF PROTEIN ENCODED BY GENE NO: 67 

This gene maps to chromosome 11, and therefore, may be used as a marker in
linkage analysis for chromosome 11 (See Accession No. AA011622).

This gene is expressed primarily in a variety of fetal and developmental tissues 
(e.g. fetal spleen, infant brain). 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental, immune or neurological abnormalities. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the developing
immune and central nervous systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 300 as residues: Ser-38 to Ser-43.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for developmental abnormalities or fetal
deficiencies. The detection in infant brain would suggest a role in neurological disorders
(both developmental and neurodegenerative conditions of the brain and nervous system,
behavioral disorders, depression, schizophrenia, Alzheimer's disease, Parkinson's
disease, Huntington's disease, mania, dementia). In addition, the detection in spleen
would similarly suggest a role in detection and treatment of immunologically mediated
disorders (e.g. immunodeficiency, inflammation, cancer, wound healing, tissue repair,
hematopoiesis).

FEATURES OF PROTEIN ENCODED BY GENE NO: 68

This gene is expressed primarily in spleen, T-cells, and fetal heart. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunological deficiencies, including AIDS, and cardiovascular
disorders. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune and cardiovascular systems, expression of this gene at significantly higher
or lower levels may be routinely detected in certain tissues (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of immune
disorders including: leukemias, lymphomas, autoimmune disorders,
immunodeficiencies (e.g. AIDS), immunosuppressive conditions (transplantation) and
hematopoietic disorders [...]

FEATURES OF PROTEIN ENCODED BY GENE NO: 69

This gene shares homology with a human collagen protein. Preferred polypeptide
fragments comprise the following amino acid sequence:
MPRKTSKCRQLLCSGASRNADTAARQSTCSSHRPPGKIPSLGPRRXPGCXSVP
SSRGEQSTGSPAAPRCGRRDAHRGLPGGAAMTPGDTWASFNPRAGHSKSQGE
GQESSGASRQDRHPVSHWVERQREAWGAPRSSSAGGVKVAATTEREPEFKIK
TGKA (SEQ ID NO:527); CSGASRNADTAARQSTCSSHRPPGKIPSLGPRRXPG
CXSVPSSRGEQSTGSPAAPRCGRRDAHRGLPGGAAMTPGDTWASFNPRAGHS
(SEQ ID NO:528); QGEGQESSGASRQDRHPVSHWVERQREAWGAPRSSSAGG
VKVAATTEREPEFKIKTGKA (SEQ ID NO:529) (See Accession No. 124886). Also
preferred are polynucleotide fragments encoding these polypeptide fragments.

This gene is expressed primarily in fetal heart.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cardiovascular disorders. Similarly, polypeptides and antibodies directed
to these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the cardiovascular system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 302 as residues:
Pro-32 to Ser-39.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the treatment and diagnosis of cardiovascular
disorders (e.g. heart disease, restenosis, atherosclerosis, stroke, angina, thrombosis).

FEATURES OF PROTEIN ENCODED BY GENE NO: 70 

The translation product of this gene shares sequence homology with a chicken
single-strand DNA-binding protein. Preferred polypeptide fragments comprise the
following amino acid sequence:

MSPRYPGGPRPPLRIPNQALGGVPGSQPLLPSGMDPTRQQGHPNMGGPMQRM
TPPRGMVPLGPQNYGGAMRPPLNALGGPGMPGMNMGPGGGRPWPNPTNAN
SIPYSSASPGNYVGPPGGGGPPGTPIMPSPADSTNSGDNMYTLMNAVPPGPNR
PNFPMGPGSDGPMGGLGGMESHHMNGSLGSGDMDSISKNSPNNMSLSNQP
GTPRDDGEMGGNF LNPFQSESYSPSMTMSV (SEQ ID NO:530); MSPRYPGG
PRPPLRIPNQALGGVPGSQPLLPSGMDPTRQQGHPNMGGPMQRMTPPRGMVP
LGPQNYGGAMRPPLNALGGPGMPGMNMGPGGGRPWPNPTNANSIPYSSASP
GNY (SEQ ID NO:531); LNALGGPGMPGMNMGPGGGRPWPNPTNANSIPYSS
ASPGNYVGPPGGGGPPGTPIMPSPADSTNSGDNMYTLMNAVPPGPN (SEQ ID
NO:532); GPMGGLGGMESHHMNGSLGSGDMDSISKNSPNNMSLSNQPGTPR
DDGEMGGNFENPPQSESYSPSMTMSV (SEQ ID NO:533); TCEHSSEAKAFHDY
(SEQ ID NO:534). Also preferred are polynucleotide fragments encoding these
polypeptide fragments (See Accession No. 1562534).

This gene is expressed primarily in placenta and to a lesser extent in the fetal 
heart and a variety of other tissues and cell types. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental abnormalities and fetal deficiencies, particularly of the
cardiovascular system. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the reproductive system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the detection and treatment of developmental 
abnormalities or fetal deficiencies, ovarian and other endometrial cancers, reproductive
dysfunction, cardiovascular disorders, and pre-natal disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 71 

This gene is expressed primarily in fetal liver and to a lesser extent in the breast
and testes.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, liver disorders (including hepatoblastomas) and reproductive disorders.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
hepatic and reproductive systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for detection and treatment of liver disorders and 
cancers (e.g. hepatoblastoma, jaundice, hepatitis, liver metabolic diseases and 
conditions that are attributable to the differentiation of hepatocyte progenitor cells). The
expression in testes and breast indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the detection and treatment of endocrine and
reproductive disorders (e.g. sperm maturation, milk production, testicular and breast 
cancers). 

FEATURES OF PROTEIN ENCODED BY GENE NO: 72

This gene maps to chromosome 1, and therefore, may be used as a marker in 
linkage analysis for chromosome 1 (See Accession No. W93595). 

This gene is expressed primarily in smooth muscle and to a lesser extent in
brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cardiovascular and neurological disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the cardiovascular and central nervous
systems, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the detection and treatment of restenosis,
atherosclerosis, stroke, angina, thrombosis, wound healing and other conditions of
heart disease. In addition, the expression in brain would suggest that polynucleotides
and polypeptides corresponding to this gene are useful for the detection and treatment of
developmental, degenerative and behavioral conditions of the brain and nervous system
(e.g. schizophrenia, depression, Alzheimer's disease, Parkinson's disease,
Huntington's disease, mania, dementia, paranoia, addictive behavior and sleep
disorders).

FEATURES OF PROTEIN ENCODED BY GENE NO: 73 

This gene shares homology with human stromalin-2. Preferred polypeptide
fragments comprise the following amino acid sequence:

QAI VLLSDLLLIPSPQMIVGGRDFLRPLVFEPEA I LQSELASFLMDHVFIQPGDL
GSGA (SEQ ID NO:535); ACSYLLCNPEFTFFSRADFARSQLVDLLTDRFQQF
LEELLQVG (SEQ ID NO:536); QKQLSSLRDRMVAFCELCQSCLSDVDTEIQEQV
ST (SEQ ID NO:537); QVILPALTLVYFSILWTLTI IISKSDAS (SEQ ID NO:538);
STHDLTRWELYEPCCQIXQKAVDTGXVPHQV (SEQ ID NO:539). Also preferred
are polynucleotide fragments encoding these polypeptide fragments (See Accession
No. R65208). This gene maps to chromosome 7, and therefore, may be used as a
marker in linkage analysis for chromosome 7 (See Accession No. D52585).

This gene is expressed primarily in the brain (infant brain, adult brain, pituitary, 
cerebellum, hippocampus, schizophrenic hypothalamus, amygdala).

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental and neurodegenerative diseases of the brain and nervous
system. Similarly, polypeptides and antibodies directed to these polypeptides are useful
in providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
central nervous system, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 306 as residues: Thr-25 to Lys-36, Lys-55
to Ser-63.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for detection and treatment of developmental, 
degenerative and behavioral conditions of the brain and nervous system (e.g.
schizophrenia, depression, Alzheimer's disease, Parkinson's disease, Huntington's
disease, mania, dementia, paranoia, addictive behavior and sleep disorders). 

FEATURES OF PROTEIN ENCODED BY GENE NO: 74 

This gene is expressed primarily in the hypothalamus of a human suffering from
schizophrenia.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders of the CNS, particularly schizophrenia. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the CNS, such as schizophrenia,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 307 as residues: Gly-38 to Ala-44.

The tissue distribution indicates that the protein products of this gene are useful 
for the study, diagnosis and treatment of schizophrenia and other disorders involving 
the CNS. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 75

Preferred polypeptides of the invention comprise the following amino acid
sequence encoded by this gene:

LAVSTSFICCADISTALPLGSSRPAPAPRHREHEHGHQARPPRLLXTSLMPLSTP
AAAQLLWTQLTPMGGRPGGRHSPPTLHTGPRALPPGPPHPSLHVAALSLLR
(SEQ ID NO:540). Polynucleotides encoding such polypeptides are also provided.

This gene is expressed primarily in endometrial tumor and to a lesser extent in 
amniotic cells. 



Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, hut are 
not limited to, reproductive and immune disorders particularly cancers of those 
systems. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the tissue(s) 
or cell type(s). For a number of disorders of the above tissues or cells, particularly of 
the reproductive and immune systems, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 308 as residues: Ser-3 to Arg-9. 

The tissue distribution indicates that the protein products of this gene are useful 
for study and treatment of immune and reproductive disorders, particularly cancers of
those systems. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 76 

This gene is expressed primarily in kidney cortex and to a lesser extent in early- 
stage human brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, renal disorders such as renal cancer. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the kidney, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred 
epitopes include those comprising [illegible].

The tissue distribution indicates that the protein products of this gene are useful
for study, treatment and diagnosis of renal diseases such as cancer of the kidney.

FEATURES OF PROTEIN ENCODED BY GENE NO: 77

This gene is expressed primarily in kidney medulla. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, metabolic and renal disorders. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the metabolic and renal systems, expression of
this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. 

The tissue distribution indicates that the protein products of this gene are useful 
for study, treatment and diagnosis of metabolic and renal diseases and disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 78

This gene is expressed in chronic synovitis and microvascular endothelium.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, arthritis and atherosclerosis. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of 
the above tissues or cells, particularly of the vascular and skeletal systems, expression 
of this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution indicates that the protein products of this gene are useful
for study, diagnosis and treatment of arthritic and other inflammatory diseases as well
as cardiovascular diseases. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 79

This gene is expressed in resting T-cells and activated monocytes. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, immune disorders. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 
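
The comparison described above, a sample expression level judged against a standard level from healthy tissue or fluid, can be illustrated with a minimal sketch. The text does not define "significantly higher or lower"; the two-fold cutoff, the use of a mean over several standard measurements, and the function name `classify_expression` are all assumptions made for illustration only.

```python
from statistics import mean


def classify_expression(sample_level, standard_levels, fold_threshold=2.0):
    """Compare one sample measurement with the mean of healthy standards.

    The fold_threshold cutoff is a hypothetical choice; the source text
    does not specify how significance is determined.
    """
    if not standard_levels:
        raise ValueError("at least one standard measurement is required")
    standard = mean(standard_levels)
    if standard <= 0 or sample_level < 0:
        raise ValueError("expression levels must be non-negative, standard > 0")
    ratio = sample_level / standard
    if ratio >= fold_threshold:
        return "higher", ratio
    if ratio <= 1.0 / fold_threshold:
        return "lower", ratio
    return "unchanged", ratio


if __name__ == "__main__":
    # Illustrative numbers only, not data from the source document.
    print(classify_expression(8.0, [2.0, 2.0]))
```

In practice a statistical test over replicate measurements would likely replace the fixed fold cutoff; the sketch only shows the direction of the comparison.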

The tissue distribution indicates that the protein products of this gene are useful 
for the study and treatment of immune diseases such as inflammatory conditions. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 80 

This gene is expressed in a variety of immune system tissues, e.g., neutrophils, 
T-cells, and TNF induced epithelial and endothelial cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, infectious and immune disorders. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the immune and vascular systems, expression 
of this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising [illegible].

The tissue distribution indicates that the protein products of this gene are useful
for study and treatment of infectious diseases, immune and vascular disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 81

This gene is expressed in activated neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammation and other immune conditions. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that the protein products of this gene are useful 
for study and treatment of immune disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 82 

This gene is expressed in neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, inflammatory and other immune conditions. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the immune system, expression of this gene 
at significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred 
epitopes include those comprising a sequence shown in SEQ ID NO: 315 as residues: 
Ala-83 to Thr-91. 

The tissue distribution indicates that the protein products of this gene are useful
for study and treatment of immune disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 83

This gene is expressed in human neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, inflammation and immune disorders. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune and inflammatory system, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution indicates that the protein products of this gene are useful 
for diagnosis and treatment of disorders of the inflammatory and immune systems. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 84 

This gene is expressed in human neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, disorders of the inflammatory and immune systems. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the inflammatory and 
immune systems, expression of this gene at significantly higher or lower levels may be 
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder.

FEATURES OF PROTEIN ENCODED BY GENE NO: 85

This gene is expressed in activated neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammation and immune system diseases. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system and inflammatory
system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., 
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample 
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. 

The tissue distribution indicates that the protein products of this gene are useful 
for diagnosis and treatment of diseases of the inflammatory and immune systems. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 86

This gene is expressed in activated neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammation and immune system disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the inflammatory and immune system, 
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 319 as residues: Met-1 to Gly-6, Gly-32 to Pro-43, Leu-55 to Gln-60.
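
Epitope listings of this kind follow a regular "Xaa-N to Yaa-M" pattern, so they can be read mechanically. The following is a minimal sketch, assuming the listing has already been cleaned of scanning errors; the function names `parse_epitope_ranges` and `extract_epitopes` and the toy sequence in the usage line are hypothetical and not taken from the sequence listing.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

RANGE_RE = re.compile(r"([A-Z][a-z]{2})-(\d+)\s+to\s+([A-Z][a-z]{2})-(\d+)")


def parse_epitope_ranges(text):
    """Return (first_residue, start, last_residue, end) tuples, 1-based."""
    return [(a, int(i), b, int(j)) for a, i, b, j in RANGE_RE.findall(text)]


def extract_epitopes(sequence, ranges):
    """Slice each range out of a one-letter sequence, checking both ends."""
    peptides = []
    for first, start, last, end in ranges:
        if not 1 <= start <= end <= len(sequence):
            raise ValueError(f"range {start}-{end} is outside the sequence")
        peptide = sequence[start - 1:end]
        if peptide[0] != THREE_TO_ONE[first] or peptide[-1] != THREE_TO_ONE[last]:
            raise ValueError(f"residues at {start}-{end} do not match the listing")
        peptides.append(peptide)
    return peptides


if __name__ == "__main__":
    listing = "Met-1 to Gly-6, Gly-32 to Pro-43, Leu-55 to Gln-60"
    print(parse_epitope_ranges(listing))
    # Toy sequence for illustration only, not a sequence from the document.
    print(extract_epitopes("MAAAAG", [("Met", 1, "Gly", 6)]))
```

Checking the named end residues against the sequence is a cheap way to catch transcription errors in either the listing or the sequence.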

The tissue distribution indicates that the protein products of this gene are useful 
for diagnosis and treatment of disorders of the immune and inflammatory system. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 87

In specific embodiments, polypeptides of the invention comprise the sequence:

EQVLALLWPRFELILEMNVQSVRSTDPQRLGGLDTRPHYITRRYAEFSSALVSIN 
QTIPNHRTMQLLGQLQVEVENFVLRVAAEFSSRKEQLVFL1NNYDN4MIGVI ME 
RAADDSKEVESFQQLLNARTQEFIEELLSPPFGGLVAFVKEAEALIERGQAERLR 
GEEARVTQLIRGFGSSWKSSVESLSQDVMRSFTNFRNGTSIIQG (SEQ ID
NO:541); ALLKYRFFYQFLLGNERATAKEIRDEYVETLSKIYLS YYRSYLGR1 MK
VQYEEVAEKDDLMGVEDTAKKGFXSKPSRSRNTIFTLGTRGSVISPTELEAPILV 
PHTAQR (SEQ ID NO: 542); EQRYPFEALFRSQHYXLLDNSCREYLFICEFFVVS 

GPXAMDLFHAVMGRTLSMTLKHLDSYLADCYDAIAVFLCIHIVERFRNIAAKRD 
VPALDRYW (SEQ ID NO:543); GGLDTRPHYITRRYAEFSSAEVSlNQ (SEQ ID
NO:544); SR KFOI [remainder illegible] (SEQ ID NO:545); and/or ALLKYRFFY

QFLEGNERATAKEIRDEYVETLSKIYLSYYRSYLGRLMKVQYEEVAEKDDLMG 
VEDTAKKGFXSKPSLRSRNTIFTLGTRGSVISPTELEAPILVPHTAQRXEQRYPF 
EALFRSQHYXLLDNSCREYLFICEFFVVSGPXAHDLFHAVMGRTLSMTLKHLD 
SYLADCYDAIAVFLCIHIVLRFRNIAAKRDVPALDRYWEQVLALLWPRFFLILEM 
NVQSVRSTDPQRLGGLDTRPHYITRRYAEFSSAJLVSINQTIPNERTMQLLGQLQV 
EVENFVLRVAAEFSSRKEQLVFLINNYDMMLGVLMERAADDSKEVESFQQLLN 
ARTQEFIEELLSPPFGGLVAFVKEAEALIERGQAERLRGEEARVTQLIRGFGSSW 
KSSVESLSQDVMRSFTNFRNGTS (SEQ ID NO:546). Polynucleotides encoding
these polypeptides are also encompassed by the invention. The translation product of 
this gene shares sequence homology with suppressor of actin mutation which is thought 
to be important in mutation suppression. 

This gene is expressed primarily in fetal liver and to a lesser extent in a variety 
of other tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, liver and mutations. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
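tissues or cells, particularly of the liver or cancer, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred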
epitopes include those comprising a sequence shown in SEQ ID NO: 320 as residues: 
Val-53 to Arg-60, Thr-88 to Thr-94, Ala-142 to Ser-150, Gly-188 to Glu-196,
Gly-208 to Ser-214, Thr-227 to Gly-232, Lys-279 to Phe-285.

The tissue distribution and homology to suppressor of actin mutation suggest
that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis and treatment of liver disorder or cancer.

FEATURES OF PROTEIN ENCODED BY GENE NO: 88 

This gene maps to chromosome 9, and therefore can be used in linkage analysis
as a marker for chromosome 9. In specific embodiments, polypeptides of the invention
comprise the sequence: 

YEGKEFDYVFSIDVNEGGPSYKLPYNTSDDPWLTAYNFLQKNDLNPMFLDQVA 
KFIIDNTKGQMLGLGNPSFSDPFrGGGRYVPGSSGSSNTLPTADPFTGAGRYV
PGSASMGTTMAGVDPFTGNSAYRSAASKTMNIYFPKKEAVTFDQANPTQILGK
LKELNGTAPEEKKLTEDDLILLEKILSLICNSSSEKPTVQQLQn.WKAINCPEDIV 
FPALDILRLSIKHPSVNENFCNEKEGAQFSSHLINLLNPKGKPANQLLALRTFC 
NCFVGQAGQKLMMSQRESLMSHAIELKSGSNKNI (SEQ ID NO: 547); 
HIALATLALNYSVCFHKD (SEQ ID NO: 548); HNIEGKAQCLSLISTILEVVQ
DLEATFRLLVALGTLISDDSNAVQLAKS (SEQ ID NO:549); LGVDSQIKKYSS
VSEPAKVSECCRFILNLL (SEQ ID NO:550); and/or YEGKEFDYVFSIDVNEGGPS
YKT-PYNTSDDPWLTAYNFLQKNDLNPMH.DQVAKFIIDNTKGQMLGLGNPSFS 
DPFTGGGRYVPGSSGSSNTLPTADPFTGAGRYVPGSASMGTTMAGVDPFTGN 
SAYRSAASKTMNIYFPKKEAVTEDQANPTQILGKLKELNGTAPEEKKLTEDDLI
LLEK1LSLICNSSSEKPTVQQLQILWKAINCPEDIVFPALDILRLSIKHPSVNENFC
NEKEGAQFSSHLINLLNPKGKPANQLLALRTFCNCFVGQAGQKLMMSQRESL 
MSHAEELKSGSNKNIHIALATLALNYSVCFHKDHNIEGKAQCLSLISTrLEVVQD 
LEATFRLLVALGTLISDDSNAVQLAKSLGVDSQIKKYSSVSEPAKVSECCRFILN 
LL (SEQ ID NO:551). Polynucleotides encoding these polypeptides are also
encompassed by the invention. These polypeptides share significant homology with
phospholipase A2 activating protein which is thought to be important in signal
transduction (see, e.g., Wang et al., Gene 161(2):237-241 (1995)).

This gene is expressed primarily in endothelial cells, to a lesser extent in placenta,
endometrial stromal cells, osteosarcoma, testis tumor, muscle, and infant brain that are
likely to be rich in blood vessels.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders of the vascular system, aberrant angiogenesis, and tumor angiogenesis.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the
vascular system or tumors, expression of this gene at significantly higher or lower 
levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. 

The tissue distribution of this gene in endothelial cells and several potential
highly vascularized tissues and its homology to phospholipase A2 activating protein
suggest that this gene may be involved in transducing signals for endothelial cells in
angiogenesis or vasculogenesis.

FEATURES OF PROTEIN ENCODED BY GENE NO: 89 

In specific embodiments, polypeptides of the invention comprise the sequence: 
YPNQDGDILRDQVLHEHIQRLSKVVTANHRALQIPEVYLREAPWPSAQSEIRTIS
AYKTPRDKVQCILRMCSTIMNLLSLANEDSVPGADDFVPVLVFVLIKANPPCLL
STVQYISSFYASCLSGEESYWWMQFTAAVE (SEQ ID NO:552); YPNQDGDILR 
DQVLHEHIORLSKVVTANHRALQIPEVYLREAPWPSAQSEIRTISAYKTPRDKVQ 
CILRMCSTIMNLLSLANEDSVPCJADDFVPVLVFVLIKANPPCLLSTVQYISSFYA 
SCLSGEESYWWMQFTAAVEFIKTI (SEQ ID NO:553); YPNQDGDILRDQVL (SEQ
ID NO:554); EAPWPSAQSEI (SEQ ID NO:555); PVLVFVLIKANP (SEQ ID
NO:560); SGEESYWWMQFTAAVEFIKTI (SEQ ID NO:556); ADDFVPVEVF
VLIKANPP (SEQ ID NO:557); YKTPRDKVQCIL (SEQ ID NO:558); and/or
GADDFVPVLVFVLIK (SEQ ID NO:559). The translation product of this gene shares
sequence homology with human ras inhibitor and yeast VPS9p which is thought to be
important in Golgi-vacuole transport.
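
Several of the shorter peptides listed above appear to be fragments of the longer ones, which can be checked by simple substring search. The sketch below is illustrative only: the helper name `locate_fragment` is hypothetical, and only the first 55 residues of the sequence labeled SEQ ID NO:552 are used, transcribed from the text above rather than from the official sequence listing.

```python
def locate_fragment(full_sequence, fragment):
    """Return the 1-based (start, end) of fragment in full_sequence, or None."""
    index = full_sequence.find(fragment)
    if index < 0:
        return None
    return index + 1, index + len(fragment)


# First 55 residues of the sequence labeled SEQ ID NO:552 in the text above.
SEQ_552_PREFIX = "YPNQDGDILRDQVLHEHIQRLSKVVTANHRALQIPEVYLREAPWPSAQSEIRTIS"

FRAGMENTS = {
    "SEQ ID NO:554": "YPNQDGDILRDQVL",
    "SEQ ID NO:555": "EAPWPSAQSEI",
}

if __name__ == "__main__":
    for name, fragment in FRAGMENTS.items():
        print(name, locate_fragment(SEQ_552_PREFIX, fragment))
```

An exact match is deliberately strict: a fragment that fails to match may indicate a transcription error in either string, so a `None` result is worth checking against the original sequence listing.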

This gene is expressed primarily in T cells and melanocytes and to a lesser 
extent in a variety of other tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
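reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, [illegible]. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For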
a number of disorders of the above tissues or cells, particularly of the immune system, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. 

The tissue distribution and homology to ras inhibitor indicate that
polynucleotides and polypeptides corresponding to this gene are useful for regulating
signal transduction and for diagnosis and treatment of disorders involving T cells and
melanocytes.

FEATURES OF PROTEIN ENCODED BY GENE NO: 90 

This gene maps to chromosome 9 and therefore polypeptides of the invention
can be used in linkage analysis as a marker for chromosome 9. The translation product
of this gene shares sequence homology with neuronal olfactomedin-related ER localized
protein which is thought to be important in influencing the maintenance, growth, or
differentiation of chemosensory cilia on the apical dendrites of olfactory neurons. In
specific embodiments, polypeptides of the invention comprise the sequence:

SARASTQPPAGQHPGPC (SEQ ID NO:561); MPGRWRWQRDMHPARKLLSLL
FLIEMGTELTQD (SEQ ID NO:562); SAAPDSLLRSSKGSTRGSL (SEQ ID 
NO:563); AAIVIWRGKSESRIAKTPGI (SEQ ID NO:564); FRGGGTLVLPPTHT 
PEWLIL (SEQ ID NO:567); PLGITLPLGAPETGGGD (SEQ ID NO:565); and/or
CAAETWKGSQRAGQLCALLA (SEQ ID NO:566).

This gene is expressed in pineal gland.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurological and endocrinological disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the neurological or endocrine systems, 
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 323 as residues: Leu-20 to Ala-26, Arg-32 to Arg-39, Thr-104 to Gly-112.

The tissue distribution and homology to olfactomedin-related protein indicate
that polynucleotides and polypeptides corresponding to this gene are useful for
maintenance, growth, or differentiation of neuronal cells in the pineal gland and,
therefore, may be useful for diagnosis and treatment of neurological disorders of the
pineal gland.

FEATURES OF PROTEIN ENCODED BY GENE NO: 91 

This gene is expressed primarily in prostate and apoptotic T cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, prostate disease and T cell dysfunction. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly prostate cancer, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for detecting abnormal activity in prostate and
T cells and possibly for treating such abnormalities.

FEATURES OF PROTEIN ENCODED BY GENE NO: 92 

This gene is expressed primarily in prostate and to a lesser extent in smooth 
muscle cells, fibroblasts, and placenta. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, disorders in prostate or vascular system. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the [illegible], expression
of this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for regulating function of prostate or highly
vascularized tissues, e.g. placenta. 



FEATURES OF PROTEIN ENCODED BY GENE NO: 93 

This gene is expressed primarily in human embryos and fetal-stage tissues and
to a lesser extent in a wide variety of other proliferative tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders in embryonic development and cell proliferation. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the embryonic tissues
and proliferative cells, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or 
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another 
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis or treatment of abnormalities in 
developing and proliferative cells and organs. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 94

The translation product of this gene shares sequence homology with 
transformation related protein which is thought to be important in transformation. 

This gene is expressed primarily in female reproductive tissues, i.e., breast 
cancer cells, placenta, and ovary and to a lesser extent in fetal lung.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancer or dysfunction of reproductive tissues. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological 
probes for differential identification of the tissue(s) or cell type(s). For a number of 
disorders of the above tissues or cells, particularly of the reproduction system, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 327 as residues: Ser-50 to Pro-61. 

The tissue distribution and homology to transformation related protein indicates 
that polynucleotides and polypeptides corresponding to this gene are useful for 
diagnosis and treatment of conditions caused by transformation, i.e. tumorigenesis in 
reproductive organs, e.g. breast, placenta, and ovary. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 95 

This gene is expressed primarily in testes, rhabdomyosarcoma, infant brain and 
to a lesser extent in some tumors and highly vascularized tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, tumorigenesis, abnormal angiogenesis, and/or neurological disorders. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
tumor tissues or vascular tissues, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those 
comprising a sequence shown in SEQ ID NO: 328 as residues: Arg-46 to Trp-54, Pro-60 
to Ile-60, Asn-116 to [...]. 

[...] corresponding to this gene are useful for a range of disease states including treatment of 
tumor or vascular disorders and the treatment of neurological disorders such as 
Alzheimer's Disease, Parkinson's Disease, Huntington's Disease, schizophrenia, mania, 
dementia, paranoia, obsessive compulsive disorder and panic disorder. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 96 

This gene maps to chromosome 7 and therefore polynucleotides of the present 
invention can be used in linkage analysis as a marker for chromosome 7. The 
translation product of this gene is homologous to the Clostridium perfringens 
enterotoxin (CPE) receptor gene product and shares sequence homology with a human 
ORF specific to prostate and a glycoprotein specific to oligodendrocytes, both of which 
are tissue-specific proteins. (See, e.g., Katahira et al., J. Cell Biol. 136(6):1239-1247 
(1997); PMID: 9087440; UI: 97242441.) 

This gene is expressed primarily in pancreas tumor and ulcerative colitis and to a 
lesser extent in several tumors and normal tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, pancreatic disorder, ulcerative colitis, tumors and food poisoning. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
digestive system or tumorigenic system, expression of this gene at significantly higher 
or lower levels may be routinely detected in certain tissues (e.g., cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal 
fluid) or another tissue or cell sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy tissue 
or bodily fluid from an individual not having the disorder. Preferred epitopes include 
those comprising a sequence shown in SEQ ID NO: 329 as residues: Gly-147 to Met- 
152, Cys-177 to Lys-188. 

The tissue distribution and homology to prostate and oligodendrocyte-specific 
protein indicates that polynucleotides and polypeptides corresponding to this gene are 
useful as a marker for diagnosis or treatment of disorders of the pancreas, ulcerative colitis, 
and tumors. Furthermore, identity to the human receptor for Clostridium perfringens 
enterotoxin indicates that the soluble portion of this receptor could be used in the 
treatment of food poisoning associated with Clostridium perfringens by blocking the 
activity of perfringens enterotoxin. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 97 

The translation product of this gene shares sequence homology with ATPase, 
which is thought to be important in metabolism. 

This gene is expressed primarily in testes and several hematopoietic cells and to 
a lesser extent in other tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, leukemia and hematopoietic disorders. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the hematopoietic system, expression of this 
gene at significantly higher or lower levels may be routinely detected in certain tissues 
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, 
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual 
having such a disorder, relative to the standard gene expression level, i.e., the 
expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 330 as residues: Leu-37 to Ala-42. 

The tissue distribution and homology to ATPase indicates that polynucleotides 
and polypeptides corresponding to this gene are useful as a marker for diagnosis and 
treatment of leukemia and other hematopoietic disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 98 

In specific embodiments, polypeptides of the invention comprise the sequence: 
MRSARPSEGCEPSWAFSQALNI (SEQ ID NO:568); LLGLKGLAPAEISAVCE 
KGNFN (SEQ ID NO:569); VAHGLAWSYYIGYLRLILPELQARIR (SEQ ID 
NO:570); TYNQHYNNLLRGAVSQRC (SEQ ID NO:571); ILLPLDCGVPDNLSM 
ADPNIRFLDKLPQQTGDRAGIKDRVYSN (SEQ ID NO:572); SIYELLENGQRAGT 
CVLEYATPLQTLFAMSQYSQAGFSGEDRLEQ (SEQ ID NO:573); AKLFCRTLE 
DILADAPESQNNCRLIAYQEPADDSSFSLSQEVLRIILRQEEKEFVTVGSLKTSAV 
PSTSTMSQEPELLISGMEKPLPLRTDFS (SEQ ID NO:574); and/or LLGLKGLA 
PAEISAVCFKGNFNVAHG[...] 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, benign prostatic hypertrophy or prostate cancer. Similarly, polypeptides 
and antibodies directed to these polypeptides are useful in providing immunological 
probes for differential identification of the tissue(s) or cell type(s). For a number of 
disorders of the above tissues or cells, particularly of the male urinary system, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 331 as residues: Ile-60 to Asn-69, Leu-106 to Asp-112, Glu-130 to Gly-136, Phe-160 
to Glu-167, Pro-184 to Cys-190, Glu-197 to Ser-202, Arg-215 to Glu-221, Thr-237 
to Pro-242. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis or treatment of benign prostatic 
hypertrophy or prostate cancer. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 99 

This gene is expressed primarily in salivary gland. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, disorders or injuries of the salivary gland. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of glandular tissues, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment of disorders of, or injuries to, the 
salivary gland or other glandular tissue. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 100 

This gene maps to chromosome 15; accordingly, polynucleotides of the 
invention can be used in linkage analysis as a marker for chromosome 15. The 
translation product of this gene shares sequence homology with a C. elegans gene of 
unknown function. In specific embodiments, polypeptides of the invention comprise 
the sequence: DPRVRLNSLTCKHIPISLTQ (SEQ ID NO:583); TMKLLKLRRNIV 
KLSLYRHFTN (SEQ ID NO:576); TLILAVAASIVFIIWTTMKFRI (SEQ ID 
NO:577); VTCQSDWRELWVDDAIWRLLFSMILFVI (SEQ ID NO:578); MVLWR 
PSANNQRFAFSPLSEEEEEDEQ (SEQ ID NO:580); KEPMLKESFEGMKMRS 
TKQEPNGNSKVNKAQEDDL (SEQ ID NO:584); and/or KWVEENVPSSVTDVALP 
ALLDSDEERMFniFERSKME (SEQ ID NO:582). Polynucleotides encoding these 
polypeptides are also encompassed by the invention. 

This gene is expressed primarily in thyroid and to a lesser extent in 
osteoclastoma, kidney medulla, and lung. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, thyroid dysfunction or cancer. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of 
the above tissues or cells, particularly of the endocrine system, expression of this gene 
at significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred 
epitopes include those comprising a sequence shown in SEQ ID NO: 333 as residues: 
Lys-107 to Leu-124, Glu-150 to Thr-159, Pro-173 to Asp-179, Ser-192 to Ser-201. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of thyroid dysfunction 
or cancer. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 101 

[...] IRHELTVLRDTRPACA (SEQ ID NO:585); and/or MDFXMALIYD (SEQ ID 
NO:586). Polynucleotides encoding these polypeptides are also encompassed by the 
invention. 

This gene is expressed primarily in kidney cortex and to a lesser extent in adult 
brain, corpus callosum, hippocampus, and frontal cortex. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neurological disorders. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the central nervous system, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment or diagnosis of neurological 
disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 102 

In specific embodiments, polypeptides of the invention comprise the sequence: 
MQEMMRNQDRALSNLESIPGGYNA (SEQ ID NO:587); LRRMYTDIQEPMLSA 
AQEQFGGNPF (SEQ ID NO:588); ASLVSNTSSGEGSQPSRTENRDPLPNPWAP 
QT (SEQ ID NO:589); SQSSSASSGTASTVGGTTGSTASGTSGQSTTAPNLVPGV 
GASMFNTPGMQSLLQQITENPQLMQNMLSAPY (SEQ ID NO:590); 
MRSMMQSLSQNPDLAAQMMLNNPLFAGNPQLQEQMRQQLPTFLQQ (SEQ ID 
NO:591); MQNPDTLSAMSNPRAMQALLQIQQGLQTLATEAPGLIPGFTPGLG 
ALGSTGGSSGTNGSNATPSENTSPTAGT (SEQ ID NO:592); TEPGHQQFI 
QQMLQALAGVNPQLQNPEVRFQQQLEQLSAMGFLNREANLQALIATGGDINA 
IERLLGSQPS (SEQ ID NO:593); RNPAMMQEMMRNQDRALSNLESIPGGY 
NALRRMYTDIQEPMLSAA (SEQ ID NO:594); GNPFASLVSNTSS (SEQ ID 
NO:595); ENRDPLPNPWA (SEQ ID NO:596); GKILKDQDTLSQHGIHD (SEQ ID 
NO:597); GLTVHLVIKTQNRP (SEQ ID NO:598); SELQSQMQRQLLSNPEMM 
(SEQ ID NO:599); PEISHMLNNPDIMR (SEQ ID NO:600); and/or 
RQLLMANPQMQQLIQRNP (SEQ ID NO:601). Polynucleotides encoding these 
polypeptides are also encompassed by the invention. 

This gene is expressed primarily in breast. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, breast cancer. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of tumor systems, expression of this gene at significantly higher or lower 
levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment and diagnosis of some types of 
breast cancer. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 103 

The translation product of this gene shares sequence homology with secreted 
serine proteases and lysozyme C precursor, which is thought to be important in 
bacteriolytic function. In specific embodiments, polypeptides of the invention comprise 
the sequence: NLCHVDCQDLLNF^NI.LAGIUCAKRIVS (SEQ ID NO:602); 
LDGFEGYSLSDWLCLAFVESKFN (SEQ ID NO:603); 
NENADGSFDYGLEQLNSHYWCN (SEQ ID NO:604); and/or 
NLCHVDCQDLLNPNLLAGIHCAKRIVS (SEQ ID NO:605). Polynucleotides 
encoding these polypeptides are also encompassed by the invention. 

This gene is expressed primarily in testes. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, infection. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
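of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
[...] expression of this gene at significantly higher or lower 
levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 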
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those 
comprising a sequence shown in SEQ ID NO: 336 as residues: Ile-62 to Phe-70, Asn- 
78 to Asn-84. 

The tissue distribution and homology to lysozyme C precursor indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for boosting the 
monocyte-macrophage system and enhancing the activity of immunoagents. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 104 

This gene is expressed primarily in apoptotic T-cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, immune disorders. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the immune system, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment and diagnosis of some immune 
disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 105 

The translation product of this gene shares sequence homology with ARI 
protein of Drosophila (accession 2058299; EMBL: locus DMARIADNE, accession 
X98309), which is thought to be important in axonal path-finding in the central nervous 
system. In specific embodiments, polypeptides of the invention comprise the sequence: 
IREVNEVIQNPAT (SEQ ID NO:606); ITRILLSHFNWDKEKLMERYF 
DGNLEKLFA (SEQ ID NO:607); NTRSSAQDMPCQICYLNYPNSYF (SEQ ID 
NO:608); TGLECGHKFCMQCWSEYLTTKIMEEGMGQTISCPAHG (SEQ ID 
NO:614); CDILVDDNTVMRLITDSKVKLKYQHLITNSFVECNRLLKWCPAPD 
CHHVVKVQYPDAKPV (SEQ ID NO:609); CDILVDDNTVMRLITDSK 
VKLKYQHLITNSFVECNRLLKWCPAPDCHHVVKV (SEQ ID NO:610); 
GCNHMVCRNQNCKAEFCWVCLGPWEPHGSAWYNCNRYNEDDAKAARDAQE 
RSRAAEQRYL (SEQ ID NO:611); FYCNRYMNHMQSERFEHKEYAQVKQ 
KMEEMQQHNMSWIEVQI^LKKAVDVLCQCRATLMYT (SEQ ID NO:612); 
YVFAFYLKKNNQSUFENNQADLENATEVLSGYLERDISQDSLQDIKQKVQDKY 
RYCESR (SEQ ID NO:613). Polynucleotides encoding these polypeptides are also 
encompassed by the invention. 

This gene is expressed primarily in adult brain, and to a lesser extent in 
endometrial tumor, melanocytes, and infant brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases or injuries involving axonal path development. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the central nervous 
system, expression of this gene at significantly higher or lower levels may be routinely 
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., 
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample 
taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

The tissue distribution and homology to ARI protein indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for treatment of 
disease states or injuries involving axonal path development, including 
neurodegenerative diseases and nerve injury. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 106 

The translation product of this gene shares sequence homology with cytochrome 
b561 [Sus scrofa], which is thought to be an integral membrane protein of 
neuroendocrine storage vesicles of neurotransmitters and peptide hormones. 

This gene is expressed primarily in frontal cortex and to a lesser extent in 
rhabdomyosarcoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neurological disorders. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the central nervous system, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred 
epitopes include those comprising a sequence shown in SEQ ID NO: 339 as residues: 
Ser-18 to Pro-24. 

The tissue distribution and homology to cytochrome b561 [Sus scrofa] indicates 
that polynucleotides and polypeptides corresponding to this gene are useful for 
treatment and diagnosis of neurological disorders. This gene may also be important in 
regulation of some types of cancers. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 107 

In specific embodiments, polypeptides of the invention comprise the sequence: 
MWGYLFVDAAWNFLGCLICGW (SEQ ID NO:615); MHFISSGNVSAIRSSILLL 
RXSLSYLGNCLRVSAIFVYFLLFLLLS (SEQ ID NO:616); and/or MDQALRGSPSE 
GFSTDPSPPQVGRQIPSFPPWRRLVLPKASGCFLEREWWLCVFKLRTRPGAEA 
HAYNSSILGGRGKGIT (SEQ ID NO:617). Polynucleotides encoding these 
polypeptides are also encompassed by the invention. 

This gene is expressed primarily in pancreas tumor and to a lesser extent in 
cerebellum. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, pancreatic tumors. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the endocrine system, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred 
epitopes include those comprising a sequence shown in SEQ ID NO: 340 as residues: 
Pro-22 to Phe-33. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of pancreatic tumors. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 108 

This gene maps to chromosome 17 and therefore polynucleotides of the 
invention can be used in linkage analysis as a marker for chromosome 17. In specific 
embodiments, polypeptides of the invention comprise the sequence: 
MLPALASCCHFSPPEQAARLKKLQEQEKQQKVEFRKRMEKEVSDFIQDSGQIK 
KKFQPMNKIERSILHDVVEVAGLTSFSFGEDDDCRYVMIFKKEFAPSDEELDSY 
RRGEEWDPQKAEEKRNXKELAQRQ (SEQ ID NO:618); EEEAAQQGPVVV 
SPASDYKDKYSHLIGKGAAKDAAHM[...] 
IRAKKRERQSGE (SEQ ID NO:619); PPRRPAQLPLTPGAGQGAGRDKAAAIRA 
HPGAPPLNHLLP (SEQ ID NO:620); AVPQAGGKQVFDLSPLELGYVRGMCVCV 
(SEQ ID NO:621); and/or MLPALASCCHFSPPEQAARLKKLQEQEKQQKVEFRK 
RMEKEVSDFIQDSGQIKKKFQPMNKIERSILHDVVEVAGLTSFSFGEDDDCRYV 
MIFKKEFAPSDEELDSYRRGEEWDPQKAEEKRNXKELAQRQEEEAAQQGPVVV 
SPASDYKDKYSHLIGKGAAKDAAHMLQANKTYGCXPVANKRDTRSIEEAMNE 
IRAKKRLRQSGE (SEQ ID NO:622). Polynucleotides encoding these polypeptides 
are also encompassed by the invention. The translation product of this gene shares 
sequence homology with FSA-1, which may play a role as a structural protein 
component of the acrosome. 

This gene is expressed primarily in fetal kidney and sperm. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, male reproductive disorders, especially involving acrosomal dysfunction. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the male 
reproductive system, expression of this gene at significantly higher or lower levels may 
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily 
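fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. Preferred epitopes include those comprising a 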
sequence shown in SEQ ID NO: 341 as residues: Glu-8 to Asn-35. 

The tissue distribution and homology to FSA-1 indicates that polynucleotides 
and polypeptides corresponding to this gene are useful for treatment of infertility due to 
acrosomal dysfunction of sperm. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 109 

This gene is expressed primarily in pituitary and to a lesser extent in 
epididymis. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, male reproductive disorders. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of 
the above tissues or cells, particularly of the male reproductive system, expression of 
this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 342 as residues: Met-1 to Trp-6. 

The expression of this gene in both pituitary and epididymis indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for treatment 
and diagnosis of male reproductive disorders. This may involve a secreted peptide 
produced in the pituitary targeting the epididymis. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 110 

In specific embodiments, polypeptides of the invention comprise the sequence: 
LLCPVLNSGXSWNFPHPSQPEYSFHGFHSTRLWI (SEQ ID NO:623); and/or 
PSTPWFLFLLGLTCPFSTSHPRWDSIPP (SEQ ID NO:624). Polynucleotides 
encoding these polypeptides are also encompassed by the invention. 

This gene is expressed primarily in resting T-cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, T-cell disorders. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the immune system, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment and diagnosis of certain immune 
disorders, especially those involving T-cells. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 111 

This gene is expressed primarily in cerebellum and whole brain and to a lesser 
extent in infant brain and fetal kidney. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neurological disorders. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the central nervous system, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred 
epitopes include those comprising a sequence shown in SEQ ID NO: 344 as residues: 
Asp-48 to Gly-55. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of neurological 
disorders. 

is thought to be important in the early assembly of ribosomes (See Accession No. 
M38016). This gene maps to chromosome 1, and therefore, may be used as a marker 
in linkage analysis for chromosome 1. 

This gene is expressed primarily in developmental tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, development of cancers and tumors in addition to healing wounds. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
immune and developmental systems, expression of this gene at significantly higher or lower 
levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution and homology to ribosomal protein S15 of E. coli 
indicates that polynucleotides and polypeptides corresponding to this gene are useful for 
diseases related to the assembly of ribosomes in the mitochondria which is important in 
the translation of RNA into protein. Therefore, this indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for diagnosis and intervention of 
multiple tumors as well as in healing wounds which are thought to be under similar 
regulation as developmental tissues. Protein, as well as antibodies directed against the 
protein have utility as tumor markers, in addition to immunotherapy targets, for the 
above listed tumors and tissues. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 113 

The translation product of this gene shares sequence homology with human 
poliovirus receptor precursors which are thought to be important in viral binding and 
uptake. Preferred polypeptide fragments comprise the following amino acid sequence: 
ELSISISNVALADEGEYTCSIFTMPVRTAKSLVTVLGIPQKPIITGYKSSLREKDT 
ATLNCQSSGSKPAARLTWRKGDQELHGEPTRIQEDPNGKTFrVSSSVTFQVTR 
EDDGASIVCSVNHESLKGADRSTSQRIEVLYTPTAMIRPDPPHPREGQKLLLHC 
EGRGNPVPQQYLWEKEGSVPPLKMTQESALIFPFLNKSDSGTYGCTATSNMGS 
YKAYYTLNVND (SEQ ID NO:625). Also preferred are polynucleotide fragments 
encoding these polypeptide fragments (See Accession No. gnl|PID|d1002627). 

This gene is expressed almost exclusively in human brain tissue. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, susceptibility to viral disease and diseases of the CNS especially cancers 
of that system. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the tissue(s) 
or cell type(s). For a number of disorders of the above tissues or cells, particularly of 
the central nervous system, expression of this gene at significantly higher or lower 
levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those 
comprising a sequence shown in SEQ ID NO: 346 as residues: Leu-26 to Asp-37, Lys-53 to 
Ser-59. 

The tissue distribution and homology to poliovirus receptor precursors indicates 
that polynucleotides and polypeptides corresponding to this gene are useful for the 
treatment and prevention of diseases that involve the binding and uptake of virus 
particles for infection. It might also be helpful in genetic therapy where the goal is to 
insert foreign DNA into infected cells. With the help of this protein, the binding and 
uptake of this foreign DNA might be aided. In addition, it is expected that 
overexpression of this gene will indicate abnormalities involving the CNS, particularly 
cancers of that system. 



FEATURES OF PROTEIN ENCODED BY GENE NO: 114 

The translation product of this gene shares sequence homology with 
YO87_CAEEL hypothetical 28.5 KD protein ZK1236.7 in chromosome III of 
Caenorhabditis elegans in addition to alpha-1 collagen type III (See Accession No. 
gi|537432). One embodiment for this gene is the polypeptide fragments comprising 
the following amino acid sequence: VPELPDRVHQLHQA VQGCALGRPGFPGGPTH 

SGUHKSHPGPAGGDYNRCDRPGQVHLHNPRGTGRRGQLUPTAGPGVHRRA 
CPSQQLPHRLGPGVPCPSPSLTPVLPSW TQSWCG LPGYTSSS (SEQ ID 
NO:630). An additional embodiment is the polynucleotide fragments encoding these 
polypeptide fragments. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neurodegeneration and immunological disorders. Similarly, polypeptides 
and antibodies directed to these polypeptides are useful in providing immunological 
probes for differential identification of the tissue(s) or cell type(s). For a number of 
disorders of the above tissues or cells, particularly of the neural and immune systems, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 347 as residues: Glu-34 to Glu-39, Gly-51 to Ser-72, Ala-88 to Glu-93, Gln-100 
to Val-105. 

The tissue distribution and homology to YO87_CAEEL hypothetical 28.5 KD 
protein ZK1236.7 in chromosome III of Caenorhabditis elegans as well as to a 
conserved alpha-1 collagen type III protein indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for the detection and treatment of 
neurodegenerative disease states and behavioral disorders such as Alzheimer's Disease, 
Parkinson's Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia, 
obsessive compulsive disorder and panic disorders. Because the gene is expressed in 
cells of lymphoid origin, the natural gene product may be involved in immune 
functions. Therefore it may be also used as an agent for immunological disorders 
including arthritis, asthma, immune deficiency diseases such as AIDS, and leukemia. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 115 

The translation product of this gene shares sequence homology with alpha 3 
type IX collagen which is thought to be important in hyaline cartilage formation via its 
ability to uptake inorganic sulfate by cells (See Accession No. gi|975657). One 
embodiment of this gene is the polypeptide fragment comprising the following amino 
acid sequence: SLRRPRSAAXQTLTTFLSSVSSASSSALPGSREPCDPRAPPPPR 
SGSAASCCSCCCSCPRRRAPLRSPRGSKRRIRQREVVDLYNGMCLQGPAGVPG 
RDGSPGANGIPGTPGIPGRDGFKGEKGECLRESFEESWTPNYKQCSWSSLNY 
GIDLGKJAECTFTKJV1RSNSALRVLFSGSLRLKCRNACCQRWYFTFNGAECSGP 
LPIEAIIYLDQGSPEMNSTINIHRTSSVEGLCEGIGAGLVDVAIWVGTCSDYPKG 
DASTGWNSVSRIIIEELPK (SEQ ID NO:634). An additional embodiment is the 
polynucleotide fragments encoding this polypeptide fragment. 

This gene is expressed primarily in smooth muscle and to a lesser extent in 
synovial tissue. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, dwarfism, spinal deformation, and specific joint abnormalities as well as 
chondrodysplasias i.e., spondyloepiphyseal dysplasia congenita, familial osteoarthritis, 
Atelosteogenesis type II, metaphyseal chondrodysplasia type Schmid and autoimmune 
disorders. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the tissue(s) 
or cell type(s). For a number of disorders of the above tissues or cells, particularly of 
the skeletal system, expression of this gene at significantly higher or lower levels may 
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

The tissue distribution and homology to alpha 3 type IX collagen indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for the treatment 
and diagnosis of diseases associated with the mutation in this gene which leads to the 
many different types of chondrodysplasias. By the use of this product, the abnormal 
growth and development of bones of the limbs and spine could be routinely detected or 
treated in utero since the protein or muteins thereof could affect epithelial cells early in 
development and later the chondrocytes of the developing craniofacial structure. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 116 

The translation product of this gene shares sequence homology with retrovirus- 
related reverse transcriptase which is thought to be important in viral replication. One 
embodiment for this gene is the polypeptide fragments comprising the following amino 
acid sequence: TKKENCRPASLMNIDTKILNKILMNQ (SEQ ID NO:640). An 
additional embodiment is the polynucleotide fragments encoding these polypeptide 
fragments (See Accession No. pir|A25313|GNHUL1). 

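This gene is expressed primarily in human [illegible]. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 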
not limited to, retroviral diseases such as AIDS, and possibly certain cancers due to 
transactivation of latent cell division genes. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of 
the above tissues or cells, particularly of the immune system, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution and homology to retrovirus-related reverse transcriptase 
indicates that polynucleotides and polypeptides corresponding to this gene are useful for 
the detection and treatment of diseases and maladies associated with retroviral infection 
since a functional reverse transcriptase (RT) or RT-like molecule is an integral 
component of the retroviral life cycle. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 117 

The translation product of this gene shares sequence homology with an 
unknown gene from C. elegans, as well as weak homology with mammalian metaxin, a 
gene contiguous to both thrombospondin 3 and glucocerebrosidase that is known to be 
required for embryonic development. Preferred polypeptide fragments comprise the 
following amino acid sequence: MCNLPIKVVCRANAEYMSPSGKVPXXHVGNQ 

vvselgpivqpa/kakghslslxjl^ 

DEATVGXITHXRYGSPYPWPLXHILAYQKQWEVKRKXKAIGWGKKTLDQVLE 
DVDQCCQALSQRLGTQPYFP^KQPTELDALVFGHL^ 

YSNLLAFCRRI EQHYFEDRGKGRLS (SEQ ID NO:641); MCNLPIKVVCRANAE 
YMSPSGKVPXXHVGNQVVSELGPIVQFVK (SEQ ID NO:642). Also preferred are 
polynucleotide fragments encoding these polypeptide fragments (See Accession No. 
gi|1326108). 

This gene is expressed primarily in fetal tissues and to a lesser extent in 
hematopoietic cells and tissues, including spleen, monocytes, and T cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, cancer; lymphoproliferative disorders; inflammation; chondrosarcoma, 
and Gaucher disease. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the hematopoietic and embryonic systems, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the diagnosis and treatment of cancer and other 
proliferative disorders. Expression in embryonic tissue and other cellular sources 
marked by proliferating cells indicates that this protein may play a role in the regulation 
of cellular division. Additionally, the expression in hematopoietic cells and tissues 
indicates that this protein may play a role in the proliferation, differentiation, and 
survival of hematopoietic cell lineages. Thus, this gene may be useful in the treatment 
of myeloproliferative disorders, and in the maintenance and differentiation of various 
hematopoietic lineages from early hematopoietic stem and committed progenitor cells. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 118 

The translation product of this gene shares sequence homology with reverse 
transcriptase, which is important in the synthesis of a cDNA chain from an RNA 
molecule, the process whereby the infecting RNA chains of retroviruses are 
transcribed into their DNA complements. One embodiment for this gene is the 
polypeptide fragment comprising the following amino acid sequence: 
MXXXNSHITIFrLNVNGLNAPNERHRLANWIQSQDQVCCIQETHLTORDTHRL 
KIKGWRKIYQANGKQKK (SEQ ID NO:647). An additional embodiment is the 
polynucleotide fragments comprising polynucleotides encoding these polypeptide 
fragments (See Accession No. gi|2072964). 

This gene is expressed primarily in skin and to a lesser extent in neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, cancer; hematopoietic disorders; inflammation; and disorders of immune 
surveillance. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the tissue(s) 
or cell type(s). For a number of disorders of the above tissues or cells, expression of 
this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal 
fluid) or another tissue or cell sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy tissue 
or bodily fluid from an individual not having the disorder. 

The tissue distribution and homology to reverse transcriptase indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for cancer 
therapy. Expression in the skin also indicates that this gene is useful in wound healing 
and fibrosis. Expression by neutrophils also indicates that this gene product plays a role 
in inflammation and the control of immune surveillance (i.e., recognition of viral 
pathogens). Reverse transcriptase family members are also useful in the detection and 
treatment of AIDS. 



FEATURES OF PROTEIN ENCODED BY GENE NO: 119 

The translation product of this gene shares sequence homology with reverse 
transcriptase, which is important in the synthesis of a cDNA copy of an RNA molecule, 
the process whereby a retrovirus reverse-transcribes its genome into an inheritable 
DNA copy. 

This gene is expressed primarily in the frontal cortex of brain. 
Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, cancer and neurodegenerative disorders. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the CNS and peripheral nervous system, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution and homology to reverse transcriptase suggest that this gene is 
useful in the treatment of cancer and AIDS. The expression in brain indicates that it 
plays a role in neurodegenerative disorders and in neural degeneration. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 120 

One embodiment of this gene has homology to a hypothetical protein in 
Schizosaccharomyces pombe (See Accession No. 2281980). Another embodiment for 
this gene is the polypeptide fragments comprising the following amino acid sequence: 
IYHLHSWIFFHFKRAFCMCFITMKVIHAHCSKLRKCXNAQISVFCTTLTASYPT 
(SEQ ID NO:651). An additional embodiment is the polynucleotide fragments 
encoding these polypeptide fragments. This gene maps to chromosome 18, and 
therefore, may be used as a marker in linkage analysis for chromosome 18. 

This gene is expressed primarily in adult hypothalamus and to a lesser extent in 
infant brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neurodegenerative disorders; endocrine function; and vertigo. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the brain, CNS and 
peripheral nervous system, expression of this gene at significantly higher or lower 
levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the treatment and diagnosis of 
neurodegenerative disorders; diagnosis of tumors of a brain or neuronal origin; 
treatments involving hormonal control of the entire body and of homeostasis; and 
behavioral disorders, such as Alzheimer's Disease, Parkinson's Disease, Huntington's 
Disease, schizophrenia, mania, dementia, paranoia, obsessive compulsive disorder and 
panic disorder. In addition, the gene or gene product may also play a role in the 
treatment and/or detection of developmental disorders associated with the developing 
embryo. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 121 

chromosome 1, and therefore, may be used as a marker in linkage analysis for 
chromosome 1. 

This gene is expressed primarily in brain and breast and to a lesser extent in a 
variety of hematopoietic tissues and cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, cancer of the brain and breast; lymphoproliferative disorders; 
neurodegenerative diseases. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the CNS, breast, and immune system, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the treatment and diagnosis of cancer of the 
brain, breast, and hematopoietic system. In addition, it may be useful for the treatment 
of neurodegenerative disorders, as well as disorders of the hematopoietic system, 
including defects in immune competency and inflammation. Protein, as well as 
antibodies directed against the protein may show utility as a tumor marker and 
immunotherapy targets for the above listed tumors and tissues. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 122 

The translation product of this gene shares sequence homology with an ATP 
synthase, a key component of the proton channel that is thought to be important in the 
translocation of protons across the membrane. 

This gene is expressed primarily in T cell lymphoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, T cell lymphoma. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the immune system, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution and homology to ATP synthase indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for the treatment 
of defects in proton transport, homeostasis, and metabolism, as well as the diagnosis 
and treatment of lymphoma. Because the gene is expressed in cells of lymphoid origin, 
the natural gene product may be involved in immune functions. Therefore it may be also 
used as an agent for immunological disorders including arthritis, asthma, immune 
deficiency diseases such as AIDS, and leukemia. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 123 

This gene maps to chromosome 15, and therefore, may be used as a marker in 
linkage analysis for chromosome 15. 

This gene is expressed primarily in a variety of fetal tissues, including fetal 
liver, lung, and spleen, and to a lesser extent in a variety of blood cells, including 
eosinophils and T cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, cancer (abnormal cell proliferation); T cell lymphomas; and hematopoietic 
disorders. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the tissue(s) 
or cell type(s). For a number of disorders of the above tissues or cells, particularly of 
the fetus and immune system, expression of this gene at significantly higher or lower 
levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the [illegible] proliferation, apoptosis, or cell 
survival. Thus it may be useful in the management and 
treatment of a variety of cancers and malignancies. In addition, its expression in blood 
cells suggest that it may play additional roles in hematopoietic disorders and conditions, 
and could be useful in treating diseases involving autoimmunity, immune modulation, 
immune surveillance, and inflammation. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 124 

This gene is expressed primarily in placenta and to a lesser extent in pineal gland 
and rhabdomyosarcoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, developmental, endocrine, and female reproductive disorders. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the [insert system 
where a related disease state is likely, e.g., immune], expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred 
epitopes include those comprising a sequence shown in SEQ ID NO: 357 as residues: 
Leu-69 to Val-76. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of disorders in 
development. Protein, as well as antibodies directed against the protein, may show 
utility as a tumor marker and immunotherapy targets for the above listed tumors and 
tissues. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 125 

This gene is expressed primarily in benign prostatic hyperplasia. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of benign prostatic hyperplasia. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the reproductive 
system, expression of this gene at significantly higher or lower levels may be routinely 
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., 
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample 
taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment and diagnosis of benign prostatic 
hyperplasia. Protein, as well as antibodies directed against the protein, may show utility 
as a tumor marker and immunotherapy targets for the above listed tumors and tissues. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 126 

This gene is expressed primarily in apoptotic T-cells and to a lesser extent in 
suppressor T cells and ulcerative colitis. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases involving premature apoptosis, and immunological and 
gastrointestinal disorders. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the immune system, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment and diagnosis of disorders involving
inappropriate levels of apoptosis, especially in immune cell lineages. Because the gene
is expressed in cells of lymphoid origin, the natural gene product may be involved in
immune functions. Therefore it may be also used as an agent for immunological
disorders including arthritis, asthma, immune deficiency diseases (such as AIDS), and
leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 127

This gene is expressed primarily in Raji cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammation and T cell autoimmune disorders. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the immune system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 360 as residues: Asp-23 to Gly-29.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment and diagnosis of inflammation and T
cell autoimmune disorders. Because the gene is expressed in cells of lymphoid origin,
the natural gene product may be involved in immune functions. Therefore it may be also
used as an agent for immunological disorders including arthritis, asthma, immune
deficiency diseases (such as AIDS), and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 128 

The translation product of this gene shares sequence homology with a C.
elegans coding region C47D12.2 of unknown function (See Accession No.
gnl|PID|e348986). One embodiment for this gene is the polypeptide fragments
comprising the following amino acid sequence: EDDGFNRSIHEVILKNITWY
SERVLTEISLGSLLILVVIRTIQYNMTRTRDKYLHTNCLAALANMSAQFRSLHQY
AAQRIISLFSLLSKKHNKVLEQATQSLRGSLSSNDVPLPDYAQDLNVIEEVIRMM
LEIINSCLTNSLHHNPNLVALLYKRDLFEQFRTHPSFQDIMQNIDLVISFFSSRLL
QAGS (SEQ ID NO:657); EDDGFNRSIHEVILKNITWYSERVLTEISLGSLLILVV
(SEQ ID NO:658); RTIQYNMTRTRDKYLHTNCLAALANMSAQFRSLHQYAAQ
RIISLFSLLSKKHN (SEQ ID NO:659); KKHNKVLEQATQSLRGSLSSNDVPLPDY
AQD (SEQ ID NO:661); SCLTNSLHHNPNLV Y ALLYKRDLFEQFRTHPSFQD
IMQNIDLVISFFSSRLLQAGS (SEQ ID NO:660). An additional embodiment is the
polynucleotide fragments encoding these polypeptide fragments. This gene maps to
chromosome 18, and therefore, may be used as a marker in linkage analysis for
chromosome 18.

This gene is expressed primarily in smooth muscle and to a lesser extent in fetal
liver.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, atherosclerosis and other cardiovascular and hepatic disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the circulatory
system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of circulatory system 
disorders such as atherosclerosis, hypertension, and thrombosis. In addition, the tissue
distribution indicates that polynucleotides and polypeptides corresponding to this gene
are useful for the detection and treatment of liver disorders and cancers (e.g.,
hepatoblastoma, jaundice, hepatitis, liver metabolic diseases and conditions that are
attributable to the differentiation of hepatocyte progenitor cells). In addition, the
expression in fetus would suggest a useful role for the protein product in developmental
abnormalities, fetal deficiencies, pre-natal disorders and various wound-healing models
and/or tissue trauma. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 129

The translation product of this gene shares sequence homology with a ribosomal
protein which is thought to be important in cellular metabolism, in addition to the
C. elegans protein F40F11.1 which does not have a known function at the current time
(See Accession No. gnl|PID|e244552). Preferred polypeptide fragments comprise the
following amino acid sequence:

[...]
YIRKYNRFEKRHKNMSVHLSPCFRDVQIGDIVTVGHCRPLSKTVRFNVLKVTK
AAGTKKQFQKF (SEQ ID NO:663); MADIQTFRAYQKQFTIFQNKKRVLLGET
GK (SEQ ID NO:664); HCHPPRLSALHPQVQPLREAPQEHVCTPVPLLQGRPDR
(SEQ ID NO:666); NIGLGFKDTPRRLLRGTYIDKKCPFIGNVSIRGRILSGVVTQ
(SEQ ID NO:669); MKMQRTIVIRRDYLHYIRKYNRFEKRHKNMSVHLSP (SEQ
ID NO:667); CFRDVQIGDIVTVGECRPLSKTVRFNVEKVTKAAGTKKQFQKF
(SEQ ID NO:668). Also preferred are polynucleotide fragments encoding these
polypeptide fragments. 

This gene is expressed primarily in Wilms' tumor and to a lesser extent in
thymus and stromal cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases affecting RNA translation. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of Wilms' tumors, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 362 as residues:
Thr-11 to Asp-20.

The tissue distribution and homology to a ribosomal protein indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diseases
affecting RNA translation.

FEATURES OF PROTEIN ENCODED BY GENE NO: 130 

The translation product of this gene shares sequence homology with a yeast
DNA helicase which is thought to be important in global transcriptional regulation (See
Accession No. gnl|PID|e243594). One embodiment for this gene is the polypeptide
fragments comprising the following amino acid sequence: IEYDSDWNPTVDQQA
MDRAHRLGQTKQVTVYRLICKGTIEERILQRAKEKSEIQRMVISG (SEQ ID
NO:670); TRMIDLLEEYMVYRKHTYXRLDGSSKISERRDMVADFQNRNDI
FVFLLSTRAGGLGINLTAXDTVHF (SEQ ID NO:671); TRMIDLLEEYMVYRK
HTYXRLDGSSKISERRDM (SEQ ID NO:674); RRDMVADFQNRNDIFVFEL
STRAGGLGINLTAXDTVHF (SEQ ID NO:675); IFYDSDWNPTVDQQAMD
RAHRLGQTKQVTVYRLICKG (SEQ ID NO:676); RLICKGTIEERILQRAK
EKSEIQRMVISG (SEQ ID NO:67S). An additional embodiment is the polynucleotide
fragments encoding these polypeptide fragments.

This gene is expressed primarily in amygdala. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases and disorders of the brain. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the central nervous system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.



The tissue distribution and homology to a DNA helicase indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diseases
affecting RNA transcription, particularly developmental disorders and healing wounds,
since the latter are thought to approximate developmental transcriptional regulation.

FEATURES OF PROTEIN ENCODED BY GENE NO: 131 

This gene is expressed primarily in prostate and to a lesser extent in amygdala
and pancreatic tumors. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, prostate enlargement and gastrointestinal disorders, particularly of the
pancreas and gall bladder. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the reproductive system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment and diagnosis of prostate diseases,
including benign prostatic hyperplasia and prostate cancer. In addition, the tissue
distribution in tumors of the pancreas indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and intervention of these tumors, in
addition to other tissues where expression has been indicated. Protein, as well as
antibodies directed against the protein, may show utility as a tumor marker and/or
immunotherapy targets for the above listed tumors and tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 132 

This gene is expressed primarily in adult lung and to a lesser extent in 
hypothalamus. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, pulmonary diseases and neurological disorders. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the pulmonary and respiratory
systems, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of pulmonary and
respiratory disorders such as emphysema, pneumonia, and pulmonary edema and
emboli. In addition, the tissue distribution indicates that polynucleotides and
polypeptides corresponding to this gene are useful for the detection/treatment of
neurodegenerative disease states and behavioral disorders such as Alzheimer's Disease,
Parkinson's Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia,
obsessive compulsive disorder and panic disorder. In addition, the gene or gene
product may also play a role in the treatment and/or detection of developmental
disorders associated with the developing embryo, sexually-linked disorders, or
disorders of the cardiovascular system.

FEATURES OF PROTEIN ENCODED BY GENE NO: 133 

This gene is expressed primarily in human liver.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cirrhosis of the liver and other hepatic disorders. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the digestive system, expression
of this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of liver disorders such
as cirrhosis, jaundice, and hepatitis. Protein, as well as antibodies directed against the
protein, may show utility as a tumor marker and/or immunotherapy targets for the above
listed tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 134

This gene is expressed primarily in fetal kidney and to a lesser extent in fetal 
liver and spleen. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, development and regeneration of liver and kidney and immunological
disorders. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, [...] expression
of this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 367 as residues: Pro-70 to Arg-77,
Tyr-102 to Thr-107.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of diseases of the
kidney and liver, such as cirrhosis, kidney failure, kidney stones, and liver failure,
hepatoblastoma, jaundice, hepatitis, liver metabolic diseases and conditions that are
attributable to the differentiation of hepatocyte progenitor cells. In addition, the
expression in fetus would suggest a useful role for the protein product in developmental
abnormalities, fetal deficiencies, pre-natal disorders and various wound-healing models
and/or tissue trauma.



FEATURES OF PROTEIN ENCODED BY GENE NO: 135 

This gene is expressed primarily in brain, bone marrow, and to a lesser extent in 
placenta, T cell, testis and neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurodegenerative and immunological diseases and cancer. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the nervous and
immune systems, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 368 as residues: Met-1 to His-6.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the detection/treatment of neurodegenerative
disease states and behavioral disorders such as Alzheimer's Disease, Parkinson's
Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia, obsessive
compulsive disorder and panic disorder. In addition, the gene or gene product may also
play a role in the treatment and/or detection of developmental disorders associated with
the developing embryo, or sexually-linked disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 136 

The translation product of this gene is homologous to the human WD repeat
protein HAN11. Preferred polypeptide fragments comprise the following amino acid
sequence:

MSU^GKRKEIYKYliAPW'IA^YAMNWSVr<PI)KRFRLAL(}SFVEliYNNKVQLVC) 
EDEESSEFICRNTFDIIPYPTTKI.MWIPI>rKGVYPDU-ATSGDYLRVWRVGETET 
RLECELNNNKNSDFCAPLTSFDWNEVDPYEIXTrSSIDTTCTIWGLFTGQVLCjRV
NLVSGflVKTQLI/UlDKEVYDIAF : SRACKiGRDMFASVGADGSVRMFDLRHLEH 

STIIYEDPQHHPLLRLCWNKQDPNYLAI'MAMDGMEVVILDVRVPAIILXKJTTIE 
HVSMALI GPH1HPATSALQRMTTRESSG TSSKC^ 

WASTQPELSPSATTTAWRYSECSVGGAVPTROGLLYFLPLPHPQS (SEQ ID 
NO:679); MSLIIGKRKEIYKYEAPWTVYAMNWSVRPDKRFRLALGSFV
EEYNNKVQLVGLDEESSEFICRNTFDHPYPTTKLMWIPDTKGVYPDLLATSGDY
ERVWRVGETETRLECLLNNNKNSDFCAPLTSFDWNEVDPYLL (SEQ ID
NO:680); SFDWNEVDPYLEGTSSIDTTCTIWGEETGQVLGRVNLVSGHVK
TQLIAHDKEVYDIAFSRAGGGRDMFASVGADGSVRMFDLRHLEHSTIIYEDPQH
HPELRLCWNKQDPNYLATMAMDGMEVVILDVRVPAHLXPGTn (SEQ ID
NO:681); VGADGSVRMFDLRHLEHSTIIYEDPQHHPLLRLCWNKQDPNYLA
TMAMDGMEVVILDVRVPAHLXPGTTIEHVSMALEGPHIHPATSALQRMTTRLS
SGTSSKCPEPLRTLSWPTQLXGEINNVQWASTQPELSPSATTTAWRYSECSVG
GAVPTRQGLLYFLPLPHPQS (SEQ ID NO:682). Also preferred are polynucleotide
fragments encoding these polypeptide fragments.

This gene is expressed primarily in placenta, embryo, T cell and fetal lung and 
to a lesser extent in endothelial, tonsil and bone marrow. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunological and developmental diseases in addition to cancers.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, [...] expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
(e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 369 as residues: Gly-19 to Gln-28, Pro-36 to Phe-42.

The tissue distribution in tumors of colon, ovary, and breast origins indicates 
that polynucleotides and polypeptides corresponding to this gene are useful for 
diagnosis and intervention of these tumors, in addition to other tumors where 
expression has been indicated. Protein, as well as antibodies directed against the
protein, may show utility as a tumor marker and/or immunotherapy targets for the above
listed tumors and tissues. Because the gene is expressed in cells of lymphoid origin, the 
natural gene product may be involved in immune functions. Therefore it may be also 
used as an agent for immunological disorders including arthritis, asthma, immune 
deficiency diseases such as AIDS, and leukemia. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 137

This gene is expressed primarily in TNF and INF induced epithelial cells, T 
cells and kidney. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammatory conditions, particularly inflammatory reactions in the
kidney. Similarly, polypeptides and antibodies directed to these polypeptides are useful
in providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the renal
system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 370 as residues: Thr-67 to Gly-72, Gln-132 to
Ala-145, Arg-150 to Pro-157.

The tissue distribution indicates that the protein products of this gene are useful
for treating the damage caused by inflammation of the kidney. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 138

This gene maps to chromosome 1, and therefore, may be used as a marker in 
linkage analysis for chromosome 1 (See Accession No. D63485).

This gene is expressed primarily in breast cancer and colon cancer and to a 
lesser extent in thymus and fetal spleen.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancers, especially of the breast and colon tissues. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution in tumors of colon and breast origins indicates that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis
and intervention of these tumors, in addition to other tumors where expression has been
indicated. Protein, as well as antibodies directed against the protein, may show utility as
a tumor marker and/or immunotherapy targets for the above listed tumors and tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 139

This gene maps to chromosome 17, and therefore, can be used as a marker in
linkage analysis for chromosome 17.

This gene is expressed primarily in CD34 positive cells, and to a lesser extent in
activated T-cells and neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunologically related diseases and hematopoietic disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the [...]
and hematopoietic system, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution in CD34 positive cells, T-cells and neutrophils indicates that
polynucleotides and polypeptides corresponding to this gene are useful for treatment
and diagnosis of hematopoietic disorders and immunologically related diseases, such as
anemia, leukemia, inflammation, infection, allergy, immunodeficiency disorders,
arthritis, asthma, and immune deficiency diseases such as AIDS.

FEATURES OF PROTEIN ENCODED BY GENE NO: 140 

This gene was recently cloned by another group, who called the gene
KIAA0313 (See Accession No. d1021609). Preferred polypeptide fragments
comprise the amino acid sequence:

LYATATVISSPSTEXLSQDQGDRASLDAADSGRGSWTSCSSGSHDNIQTIQ 
HQRSWETLPFGHTHFDYSGDPAGLWASSSHMDQIMFSDHSTKYNRQNQSRES 
LEQAQSRASWASSTGYWGEDSEGDTGTIKRRGGKDVSIEAESSSLTSVTTEETK 
PVPMPAHIAVASSTTKGLIARKEGRYREPPPTPPGYIGIPITDFPEGHSHPARKP 
PDYNVALQRSRMVARSSDTAGPSSVQQPHGHPTSSRPVNKPQWHKXNESDPR 
LAPYQSQGFSTEEDEDEQVSAV (SEQ ID NO:683); HMDQIMFSDHSTKYNRQ 
NQSRESEEQAQSRASWASSTGYWGE (SEQ ID NO:684); SVTTEETKPVPMP 
AHIAVASSTTKGLIARKEGRYREPPPTPPGYIGIPITD (SEQ ID NO:685); and 
VALQRSRMVARSSDTAGPSSVQQPHGHPTSSRPVNKPQW 
HKXNESDPRLAPYQSQGF (SEQ ID NO:686). Also preferred are polynucleotide 
fragments encoding these polypeptide fragments. This gene maps to chromosome 4, 
and therefore, may be used as a marker in linkage analysis for chromosome 4 (See 
Accession No. AB002311).

This gene is expressed primarily in ovarian cancer, tumors of the testis, brain,
and colon. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, ovarian, testicle, brain and colon cancers. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the male and female reproductive systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. 

The tissue distribution in tumors of colon, ovary, testis, and brain origins 
indicates that polynucleotides and polypeptides corresponding to this gene are useful for 
diagnosis and intervention of these tumors, in addition to other tumors where 
expression has been indicated. Protein, as well as antibodies directed against the
protein, may show utility as a tumor marker and/or immunotherapy targets for the above
listed tumors and tissues. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 141 

This gene is expressed primarily in spleen and colon cancer. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, colon cancer and immunological disorders. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the gastrointestinal tract and immune
systems, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. 

The tissue distribution in tumors of colon, ovary, and breast origins indicates 
that polynucleotides and polypeptides corresponding to this gene are useful for 
diagnosis and intervention of these tumors, in addition to other tumors where 
expression has been indicated. Protein, as well as, antibodies directed against the 
protein may show utility as a tumor marker and/or immunotherapy targets for the above 
listed tumors and tissues. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 142 

Translation product is homologous to T cell translocation protein, a putative zinc-finger 
factor (See Accession No. 340454), as well as to the G-protein coupled receptor 
TM5 consensus polypeptide (See Accession No. R50734). Preferred polypeptide 
fragments comprise the following amino acid sequence: 

CLLFVFVSLGMRCLFWTIVYNVLYLKHKCNTVLLCYHLCSI (SEQ ID NO:687); 

ACSKLIPAFEMVMRAKDNVYHLDCFACQLCNQRXCVGDKFFLKNNXXLCQT 
DYEEGLMKEGYAPXVR (SEQ ID NO:688). Also preferred are polynucleotide 
fragments encoding these polypeptide fragments. 

This gene is expressed primarily in fetal brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neurological disorders including brain cancer. Similarly, polypeptides 
and antibodies directed to these polypeptides are useful in providing immunological 
probes for differential identification of the tissue(s) or cell type(s). For a number of 
disorders of the above tissues or cells, particularly of the Central Nervous System, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the detection/treatment of neurodegenerative 
disease states and behavioral disorders such as Alzheimer's Disease, Parkinson's 
Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia, obsessive 
compulsive disorder and panic disorder. In addition, the gene or gene product may also 
play a role in the treatment and/or detection of developmental disorders associated with 
the developing embryo. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 143 

Translation product for this gene has significant homology to the Fas ligand, 
which is a cysteine-rich type II transmembrane protein/tumor necrosis factor receptor 
homolog. Mutations within this protein have been shown to result in generalized 
lymphoproliferative disease leading to the development of lymphadenopathy and 
autoimmune disease (See Medline Article No. 94185175). Preferred polypeptide 
fragments comprise the following amino acid sequence: 

SALSEPGAPDRRRPCPESYPRRPDDEQWPPPTALCLDYAPLPPSS (SEQ ID 
NO:689). Also preferred are polynucleotide fragments encoding these polypeptide 
fragments (See Accession No. 473565). 

This gene is expressed primarily in osteoblasts, lung, and brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, osteoblast related, pulmonary, neurological, and immunological 
diseases. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the tissue(s) 
or cell type(s). For a number of disorders of the above tissues or cells, particularly of 
the skeletal and nervous systems, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those 
comprising a sequence shown in SEQ ID NO: 376 as residues: Trp-33 to Thr-40, 
Lys-45 to Ile-63. 
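
Residue ranges of the form used for the preferred epitopes (three-letter residue name, 1-based position, "to", three-letter residue name, position) can be mapped onto a one-letter amino acid sequence as in the following minimal sketch. The function name `extract_epitope` and the toy sequence in the usage note are hypothetical; the sequence of the SEQ ID NO referred to above is not reproduced in this section and is not assumed here.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches ranges written like "Trp-33 to Thr-40".
RANGE_PATTERN = re.compile(r"([A-Z][a-z]{2})-(\d+)\s+to\s+([A-Z][a-z]{2})-(\d+)")


def extract_epitope(sequence, residue_range):
    """Return the subsequence named by a range such as 'Trp-33 to Thr-40'.

    Positions are 1-based and inclusive. The named end residues are checked
    against the sequence so that a mismatched range is rejected.
    """
    match = RANGE_PATTERN.fullmatch(residue_range.strip())
    if match is None:
        raise ValueError(f"unrecognized residue range: {residue_range!r}")
    first_name, first_pos, last_name, last_pos = match.groups()
    start, end = int(first_pos), int(last_pos)
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("residue positions fall outside the sequence")
    for name, pos in ((first_name, start), (last_name, end)):
        if name not in THREE_TO_ONE:
            raise ValueError(f"unknown residue name: {name}")
        if sequence[pos - 1] != THREE_TO_ONE[name]:
            raise ValueError(f"residue {pos} is {sequence[pos - 1]}, not {name}")
    return sequence[start - 1:end]
```

With the hypothetical toy sequence `"MAWKLTGG"`, the call `extract_epitope("MAWKLTGG", "Trp-3 to Thr-6")` returns `"WKLT"`.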

The tissue distribution in osteoblasts, lung, and brain combined with its 
homology to the Fas ligand indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and intervention of these tumors, in 
addition to other tumors where expression has been indicated. Protein, as well as, 
antibodies directed against the protein may show utility as a tumor marker and/or 
immunotherapy targets for the above listed tumors and tissues. Because the Fas ligand 
gene is known to be expressed in cells of lymphoid origin, the natural gene product 
may be involved in immune functions. Therefore it may be also used as an agent for 
immunological disorders including asthma, immune deficiency diseases such as AIDS 
and leukemia, and various autoimmune disorders including lupus and arthritis. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 144 

This gene shares sequence homology with a 21.5 KD transmembrane protein in 
the SEC15-SAP4 intergenic region of yeast [...] 

GPLKQIPMNLHMYMAGNTISIFFrMMVCMMAWRPIQALMAISATFKMLESSSQ 
KFLQGLVYLIGNLMGLALAVYKCQSMGLLPTFIASDWLAFIEPPERMEFSGG 
GLLL (SEQ ID NO:691); PVGYLDKQVPDTSVQETDR1LVEKRCWDIALGPLKQ 
IPMNLFI (SEQ ID NO:693); and ATFKMLESSSQKFLQGLVYLIGNLMGLALAV 
YKCQSMGLLPTHASD (SEQ ID NO:692). Also preferred are polynucleotide 
fragments encoding these polypeptide fragments. 

This gene is expressed primarily in osteoclastoma, hemangiopericytoma, liver, and 
lung. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, osteoclastoma, hemangiopericytoma, liver and lung tumors. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the above tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the lung 
and liver systems, expression of this gene at significantly higher or lower levels may be 
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosing osteoclastoma, 
hemangiopericytoma, liver and lung tumors. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 145 

Translation product of this gene shares homology with the glucagon-69 gene, 
which may indicate this gene plays a role in regulating metabolism (See Accession No. 
A60318). One embodiment for this gene is the polypeptide fragments comprising the 
following amino acid sequence: 

PTTKLDLMEKKKJ4IQIRFPSFYHKLVDSGRMRSKRETRREDSDTKHNL (SEQ ID 
NO:694). An additional embodiment is the polynucleotide fragments encoding these 
polypeptide fragments. 

This gene is expressed primarily in brain, kidney, colon, and testis. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, brain, kidney, colon, and testicular cancer. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the male reproductive, neurological, 
circulatory, and gastrointestinal systems, expression of this gene at significantly higher 
or lower levels may be routinely detected in certain tissues (e.g., cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal 
fluid) or another tissue or cell sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy tissue 
or bodily fluid from an individual not having the disorder. 

The tissue distribution in tumors of brain, kidney, colon, and testis origins 
indicates that polynucleotides and polypeptides corresponding to this gene are useful for 
diagnosis and intervention of these tumors, in addition to other tumors where 
expression has been indicated. Protein, as well as, antibodies directed against the 
protein may show utility as a tumor marker and/or immunotherapy targets for the above 
listed tumors and tissues. The tissue distribution indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for the detection/treatment of 
neurodegenerative disease states and behavioral disorders such as Alzheimer's Disease, 
Parkinson's Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia, 
obsessive compulsive disorder and panic disorder. In addition, the gene or gene 
product may also play a role in the treatment and/or detection of developmental 
disorders associated with the developing embryo, sexually-linked disorders, or 
disorders of the cardiovascular system. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 146 

The translation product of this gene shares sequence homology with goliath 
protein, which is thought to be important in the regulation of gene expression during 
development. The protein may serve as a transcription factor. One embodiment for this gene 
is the polypeptide fragments comprising the following amino acid sequence: 
TEHIIAVMITLLRGKDILSYLEKjNISVQMT1AVGTRMPPKNT\SRGSIATVSISFIV 
LMIISSAWLIFYFIQKIRYTNARDRNQRRLGDAAKKAISKLTTRTVKKGDKETD 
PDFDHCAVCIESYKQNDVVRILPCKHVFIIKSCVDPWLSHHCTCPMCKLNILKA 
LGIV (SEQ ID NO:695); TEHI1AVMITEERGKD1LS YEEKNIS VQMTIAVGTRMP 
PKNFSRGSEVFVSISFTVI M TISSAWT 1FYF [...] 
WLSEHCTCPMCKLNILKALGIV (SEQ ID NO:699). An additional embodiment is 
the polynucleotide fragments encoding these polypeptide fragments (See Accession No. 
157535). Moreover, another embodiment is the polynucleotide fragments encoding 
these polypeptide fragments: 

MTHPGTEHIIAVMITELRGKDILSYLEKNISVQMTIAVGTRMPPKNFSRGS 
LVFVSISFIVLMIISSAWLIFYFIQKIRYTNARDRNQRRLGDAAKKAISKLTTRTV 
KKGDKETDPDFDHCAVC1ESYKQNDVVR1LPCKHVFHKSCVDPWLSEHCTCP 
MCKLNILKALGIVPNLPCTDNVAFDMERLTRTQAVNRRSALGDLAGDNSLGLE 
PLRTSGISPLPQDGELTPRTGE1NIAVTKEWFIIASFGLLSALTLCYMIIRATASLN 
ANEVEWF (SEQ ID NO:696); MTHPGTEHIIAVMrTELRGKDILSYLEKNISVQM 
HAVGTRMPPKNFSRGSLVFVSISfTVIJVHISSAWLIFYFIQKIRYTNARDRNQRR 
LGDAAKKAISKLTTRT (SEQ ID NO:700); AAKKAISKLTTRTVKKGDKE 
TDPDFDHCAVCIESYKQNDVVRILPCKUVFHKSCVDPWLSEHCTCPMCKLNIL 
KALGIVPNLPC (SEQ ID NO:701); TQAVNRRSALGDLAGDNSLGLEPLRTSGI 

SPl-PQDGELTPRTGEINIAVTKEWFUASFGLLSALTLCYMIIRATASLNANEVEW 
F (SEQ ID NO:702); PLHGVADHLGCDPQTRFFVPPNIKQWIALLQRGNCTF 
KEK1SRAAFHNAVAVVIYNNKSKEEPVTMTHPGTEHIIAVMITELRGKDILSYLE 
KNISVQMTIAVGTRMPPKNf-SRGSLVFVSISFIVLMIISSAWLIFYFIQKIRYTNA 
RDRNQRRLGDAAKKAISKLTTRTVKKGDKETDPDFDHCAVCIESYKQNDVVRI 
LPCKJ-1VFHKSCVDPWLSEHCTCPMCKLNILKALGIVPNLPCTDNVAFDMERLT 
RTQAVNRRSALGDLAGDNSLGLEPLRTSGISPLPQDGELTPRTGEINIAVTKEW 
FIIASFGLLS ALTEC YMIIRATASLNANEVEWF (SEQ ID NO:703); and 
HGVADHLGCDPQTRFFVPPNIKQWIALLQRGNCTFKEKISRAAFHNAVAVVIY 
NNKSKEE (SEQ ID NO:704). An additional embodiment is the polynucleotide 
fragments encoding these polypeptide fragments. When tested against Jurkat cell lines, 
supernatants removed from cells containing this gene activated the GAS pathway. 
Thus, it is likely that this gene activates immune cells through the JAKS/STAT signal 
transduction pathway. 

This gene is expressed primarily in macrophage, breast, kidney and to a lesser 
extent in synovium, hypothalamus and rhabdomyosarcoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, schizophrenia and cancer. Similarly, polypeptides and antibodies directed 
to these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the immune and neural system, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution and homology to zinc finger protein indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for the treatment 
of schizophrenia, kidney disease and other cancers. The tissue distribution in 
macrophage, breast, and kidney origins indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and intervention of tumors within 
these tissues, in addition to other tumors where expression has been indicated. Protein, 
as well as, antibodies directed against the protein may show utility as a tumor marker 
and/or immunotherapy targets for the above listed tumors and tissues. Because the gene 
is expressed in cells of lymphoid origin, the natural gene product may be involved in 
immune functions. Therefore it may be also used as an agent for immunological 
disorders including arthritis, asthma, immune deficiency diseases such as AIDS, and 
leukemia. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 147 

The translation product of this gene shares sequence homology with HNP36 
protein, an equilibrative nucleoside transporter, which is thought to be important in 
gene transcription as well as serving as an important component of the nucleoside 
transport apparatus (See Accession No. 1845345). One embodiment for this gene is 
the polypeptide fragments comprising the following amino acid sequence: 

MSGQGLAGFFASVAMICAIASGSELSESAFGYFrTACAVIILTIICYLGLPRLEFYR 
YYQQLKLEGPGEQETKLDLISKGEEPRAGKEESGVSVSNSQPTNESHSIKAILK 
NISVLAFSVCFIFTITIGMFPAVTVEVKSSIAGSSTWERYFIPVSCFLTFNIFDWL.G 
RSLTAVFMWPGKDSRWLPSWXLARLVFVPLLLLCNIKPRRYL'rVVFEHDAWFI 
FFMAAFAFSNGYLASLCMCFGPKKVKPAEAETAEPSWPSSCVWVWHWGLFS 
PSCSGQLCDKGWTEGLPASLPVCLLPLPSARGDPEWSGGFFF (SEQ ID 
NO:705); MSGQGLAGFFASVAJV1ICAIASGSELSESAFGYFITACAVIILTIIC 
YLGLPRLEFYR Y YQQLKLE GPGEQETKLDLISKGEEPRAGKEESGVS VSNSQ 
PTNESIISI (SEQ ID NO:706); SGVSVSNSQPTNESHSIKAILKNISVI .AFS VCF1 
RITIGMFPAVTVFVKSSIAGSSTVVFmTIPVSrn TFNMFPWJ CI R [...] 
GWTEGLPASLPVCLLPLPSARGDPEWSGGFFF (SEQ ID NO:709). An additional 
embodiment is the polynucleotide fragments encoding these polypeptide fragments. 

This gene is expressed primarily in eosinophils and aortic endothelium and to a 
lesser extent in umbilical vein endothelial cell and thymus. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, hematopoietic disease. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the circulatory system, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution and homology to HNP36 protein indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for the treatment 
of blood neoplasias and other hematopoietic disease. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 148 

This gene is expressed primarily in breast cancer cell lines, thymus stromal 
cells, and ovary. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, endocrine and female reproductive system diseases including breast 
cancer. Similarly, polypeptides and antibodies directed to these polypeptides are useful 
in providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
endocrine system, expression of this gene at significantly higher or lower levels may be 
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the diagnosis and treatment of endocrine 
disorders. In addition, the tissue distribution in tumors of thymus, ovary, and breast 
origins indicates that polynucleotides and polypeptides corresponding to this gene are 
useful for diagnosis and intervention of these tumors, in addition to other tumors where 
expression has been indicated. Protein, as well as, antibodies directed against the 
protein may show utility as a tumor marker and/or immunotherapy targets for the above 
listed tumors and tissues. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 149 

Translation product of this gene has homology to pmt1 and pmt2, two 
conserved Schizosaccharomyces pombe genes. One embodiment for this gene is the 
polypeptide fragments comprising the following amino acid sequence: 

DDDGFEIVPIEDPAK1IRILDPEGLALGAVIASSKKAKRDLIDNSFNRYTFNEDEG 
ELPEWFVQEEKQHRIRQEPVGKKEVEHYRKRWRE1NARPIXXXXXXXXXXX 
XXXXXXLEQTRKKAEAVVNTVDIXRTRES (SEQ ID NO:710); 

DDDGFEIVPIEDPAKHRILDPEGLALGAVIASSKKAKRDLIDNSENRYTF (SEQ 
ID NO:711); KRWREINARPIXXXXXXXXXXXXXXXXXEEQTRKKAE 
AVVNTVDIXRTRES (SEQ ID NO:712). An additional embodiment is the 
polynucleotide fragments encoding these polypeptide fragments (See Accession No. 
e1216734). 

This gene is expressed primarily in retina and ovary and to a lesser extent in 
breast cancer cells, epididymis and osteosarcoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neuronal growth disorders, cancer and reproductive system disorders. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
neural and reproductive system, expression of this gene at significantly higher or lower 
levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those 
comprising a sequence shown in SEQ ID NO: 382 as residues: Met-1 to Gly-7. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the diagnosis or treatment of reproductive 
system disease and cancers. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 150 

One embodiment for this gene is the polypeptide fragments comprising the following 
amino acid sequence: 

MIKDKGRARTALTSSQPAHLCPENPLLHLKAAVKEKKRNKKKKTIGSPKRIQS 
PLNNKLLNSPAKTLPGACGSPQKEIDGFLKHEGPPAEKPLEELSASTSGVPGLS 
SLQSDPAGCVRPPAPNLAGAVEFNDVKTLLREWITTISDPMEEDILQVVKYCTD 
LIEEKDLEKLDLVIKYMKREMQQSVESVWNMAFDFILDNVQVVLQQTYGSTLK 
VT (SEQ ID NO:713); MIKDKGRARTAETSSQPAHLCPENPLLHLKAAVKE 
KKRNKKKKT1GSPKRIQ (SEQ ID NO:714); KRIQSPLNNKLLNSPAKT 
EPGACGSPQKLIDGFLKHEGPPAEKPLEELSASTSGVPGLSSLQSDPAGCVRPP 
APNLAGA VEFND VKTLLREWriTISDPM (SEQ ID NO:715); 
TISDPMEEDILQVVKYCTDLIEEKDLEKLDLVIKYMKRLMQQSVE 
SVWNMAFDFILDNVQVVLQQTYGSTLKVT (SEQ ID NO:716). An additional 
embodiment is the polynucleotide fragments encoding these polypeptide fragments. 

This gene is expressed primarily in 12 week embryo and to a lesser extent in 
hemangiopericytoma and frontal cortex. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, growth disorders and hemangiopericytoma. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the circulatory and neural systems, expression 
of this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 383 as residues: Leu-4 to Lys-11. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the treatment of growth disorders, 
hemangiopericytoma and other soft tissue tumors. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 151 

The translation product of this gene has been found to have homology to a 
human DNA mismatch repair protein PMS3. Preferred polypeptide fragments comprise 
the following amino acid sequence: FCHDCKFPEASPAMNCEP (SEQ ID NO:717). 
Also preferred are polynucleotide fragments encoding these polypeptide fragments (See 
Accession No. R95250). 

This gene is expressed primarily in neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, lymphoma, immunodeficiency diseases, and cancers resulting from 
genetic instability. Similarly, polypeptides and antibodies directed to these polypeptides 
are useful in providing immunological probes for differential identification of the 
tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the immune system, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those 
comprising a sequence shown in SEQ ID NO: 384 as residues: Met-1 to Lys-6. 

The tissue distribution in neutrophils and the sequence homology indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis of 
Hodgkin's lymphoma, since the elevated expression and secretion by the tumor mass 
may be indicative of tumors of this type. Additionally the gene product may be used as 
a target in the immunotherapy of the cancer. Because the gene is expressed in cells of 
lymphoid origin, the natural gene product may be involved in immune functions. 
Therefore it may be also used as an agent for immunological disorders including 
arthritis, asthma, immune deficiency diseases such as AIDS, and leukemia. 
Furthermore, its homology to a known DNA repair protein suggests that the gene may be 
useful in establishing cancer predisposition and prevention in gene therapy applications. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 152 

[...] biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, infectious diseases and lymphoma. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the immune system, expression of this gene 
at significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the treatment of inflammation and infectious 
diseases. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 153 

One embodiment for this gene is the polypeptide fragments comprising the 
following amino acid sequence: 

MASSVPAGGHTRAGGIFLIGKLDLEASLFKSFQWLPFVLRKKC 

NFFCWDSSAHSLPLUPLSASCSAPACHASDTHLLYPSTRALCPSIFAWLVAPHS 
VFRTNAPGPTPSSQSSPVFPVFPVSFMALIVCXLVCC (SEQ ID NO:720); 

MASSVPAGGHTRAGGIFLIGKLDLEASLFKSFQWLPFVLRKKCNFFCWDSSAH 
SLPLHPLSASCSAPACHA (SEQ ID NO:721); FAWLVAPHSVFRTNAPGPTPS 
SQSSPVFPVFPVSFMALIVCXLVCC (SEQ ID NO:722). An additional embodiment 
is the polynucleotide fragments encoding these polypeptide fragments. 

This gene is expressed primarily in neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, inflammation and infectious disease. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the immune system, expression of this gene 
at significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred 
epitopes include those comprising a sequence shown in SEQ ID NO: 386 as residues: 
Ser-11 to Pro-17. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the treatment of infectious diseases and 
inflammation. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 154 

This gene is expressed in multiple tissues including ovary, uterus, adipose 
tissue, brain, and the liver. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, [...] cancer. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the female reproductive system, expression 
of this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution of this gene indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for diagnostic or therapeutic uses in 
the treatment of the female reproductive system, obesity, and liver disorders, 
particularly cancer in the above tissues. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 155 

This gene maps to chromosome 3 and, therefore, may be used as a marker in 
linkage analysis for chromosome 3 (See Accession No. D87452). 

This gene is expressed in multiple tissues including brain, aortic endothelial 
cells, smooth muscle, pituitary, testis, melanocytes, spleen, neutrophils, and placenta.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, [text illegible in source] brain and the female reproductive system, as
well as cardiovascular disorders, such as
atherosclerosis and stroke. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the central nervous and immune systems, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution suggests that polynucleotides and polypeptides
corresponding to this gene are useful in treatment/detection of disorders in the nervous 
system, including schizophrenia, neurodegeneration, neoplasia, brain cancer as well as 
cardiovascular and female reproductive disorders including cancer within the above 
tissues. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 156 

The translation product of this gene shares sequence homology with the human 
gene encoding cytochrome b561 (See Accession No. P10897). Cytochrome b561 is a
transmembrane electron transport protein that is specific to a subset of secretory vesicles 
containing catecholamines and amidated peptides. This protein is thought to supply 
reducing equivalents to the intravesicular enzymes dopamine-beta-hydroxylase and 
alpha-peptide amidase. Preferred polypeptides of the invention comprise the amino acid 
sequence: 

MAMEGYWRFLALLGSALLVGr^ 

VLMVTGFVFIQG1AIIVYRLPWTWKCSKLLMKSIHAGLNAVAAILAIISVVAVFE 
NHNVNNIANMYSLHSWVGLIAVICYLLQLLSGFSVFLLPWAPLSLRAFLMPIHV 
YSGIVIFGTVIATALMGLTEKLIFSLRDPAYSTFPPEGVFVNTLGLLILVFGALIF 
WIVTRPQWKRPKEPNSTILHPNGGTEQGARGSMPAYSGNNMDKSDSEL 
NSEVAARKRNLALDEAGQRSTM (SEQ ID NO:724); as well as antigenic fragments 
of at least 20 amino acids of this gene and/or biologically active fragments. Also 
preferred are polynucleotide fragments encoding these polypeptide fragments. 
This gene is expressed primarily in anergic T-cells. 
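
The phrase "fragments of at least 20 amino acids" above can be made concrete by enumerating every contiguous sub-sequence of that minimum length. The sketch below is illustrative only: the function name `fragments` and the toy 22-residue sequence are assumptions for the example, and it does not attempt to predict which fragments are actually antigenic or biologically active.

```python
def fragments(sequence, min_length=20):
    """Yield every contiguous fragment of `sequence` with at least `min_length` residues."""
    n = len(sequence)
    for length in range(min_length, n + 1):
        for start in range(n - length + 1):
            yield sequence[start:start + length]


if __name__ == "__main__":
    # Hypothetical 22-residue sequence, used only to keep the output small.
    toy = "ACDEFGHIKLMNPQRSTVWYAC"
    for frag in fragments(toy):
        print(len(frag), frag)
```

For a sequence of n residues this produces (n - 19)(n - 18)/2 fragments, so for full-length polypeptides the candidates would normally be filtered rather than listed exhaustively.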

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, immune system and metabolism related diseases. Similarly, polypeptides 
and antibodies directed to these polypeptides are useful in providing immunological 
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the immune system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution indicates that the protein product or RNA of this gene is 
useful for treatment or diagnosis of immune system and metabolic diseases or
conditions including Tay-Sachs disease, phenylketonuria, galactosemia, various
porphyrias, and Hurler's syndrome. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 157

The translation product of this gene shares sequence homology with collagen
which is important in mammalian development. This gene also shows sequence 
homology with bcl-2. (See Accession No. P80988.) Preferred polypeptide fragments 
comprise the amino acid sequence: PGRAGPSPGLSLQLPARPGHPAGNLAPL 
TSRPQPLCRIPAVPG (SEQ ID NO:725). Also preferred are polynucleotide 
sequences encoding this polypeptide fragment. 

This gene is expressed primarily in HL-60 tissue culture cells and to a lesser 
extent in liver, breast, and uterus. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, immunological diseases, hereditary disorders involving the MHC class
of immune molecules, as well as developmental disorders and reproductive disorders.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune and reproductive systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 390 as residues: Ser-39 to Gly-46, Leu-49
to Ala-62.

The tissue distribution and homology to collagen indicate that polynucleotides
and polypeptides corresponding to this gene are useful for diagnosis and treatment of
hereditary MHC disorders and particularly autoimmune disorders including rheumatoid
arthritis, lupus, scleroderma, and dermatomyositis, as well as many reproductive
disorders, including cancer of the uterus and breast tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 158 

This gene is expressed primarily in the amygdala region of the brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, a variety of brain disorders, particularly those affecting mood and
personality. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the tissue(s) 
or cell type(s). For a number of disorders of the above tissues or cells, particularly of 
the brain and central nervous system, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment and/or diagnosis of a variety of brain 
disorders, particularly bipolar disorder, unipolar depression, and dementia. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 159 

This gene is expressed in a variety of tissues and cell types including brain, 
smooth muscle, kidney, salivary gland and T-cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, cancers of a variety of organs including brain, smooth muscle, kidney, 
salivary gland and T-cells and cardiovascular disorders. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the central nervous, urinary, salivary,
digestive, and immune systems, expression of this gene at significantly higher or lower 
levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution in brain, smooth muscle, and T-cells indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis of 
various neurological and cardiovascular disorders, including, but not limited to, cancer
within the above tissues. Additionally the gene product may be used as a target in the
immunotherapy of the cancer. Because the gene is expressed in cells of lymphoid
origin, the natural gene product may be involved in immune functions. Therefore it may
also be used as an agent for immunological disorders including arthritis, asthma,
immune deficiency diseases such as AIDS, and leukemia. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 160 

The translation product of this gene shares sequence homology with collagen 
which is thought to be important in cellular interactions, extracellular matrix formation, 
and has been found to be an identifying determinant in autoimmune disorders. 
Moreover, this gene shows sequence homology with the yeast protein Sls1p, an
endoplasmic reticulum component involved in the protein translocation process in
the yeast Yarrowia lipolytica. (See Accession No. 1052828; see also J. Biol. Chem. 271,
11668-11675 (1996).) With mouse, this same region shows sequence homology with
the heavy chain of kinesin. (See Accession No. 2062607.) Recently, suppression of the
heavy chain of kinesin was shown to inhibit insulin secretion from primary cultures of
mouse beta-cells. (See Endocrinology 138 (5), 1979-1987 (1997).) Moreover, kinesin
was found associated with drug resistance and cell immortalization. (See 468355.)
Thus, it is likely that this gene also acts as a genetic suppressor element.

This gene is expressed primarily in the greater omentum and to a lesser extent in 
a variety of organs and cell types including gall bladder, stromal bone marrow cells, 
lymph node, liver, testes, pituitary, and thymus. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, [text illegible in source].
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
immune and gastrointestinal systems, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 393 as residues: Asn-27 to Leu-47, Gln-81
to Lys-88, Asp-93 to Lys-102, Asn-107 to Leu-116, Met-129 to Glu-141, Glu-150
to Asp-157, Lys-176 to Glu-185, Glu-333 to Tyr-349, Cys-393 to Leu-403, Gln-423
to Gly-429.
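
Residue ranges written in the "Asn-27 to Leu-47" style above can be turned into the corresponding sub-sequences mechanically. The following is a minimal sketch under stated assumptions: the function name `extract_epitopes` and the toy sequence are hypothetical, and SEQ ID NO: 393 itself is not reproduced here. The code checks that the named residues actually occur at the stated 1-based positions.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

RANGE_RE = re.compile(r"([A-Z][a-z]{2})-(\d+) to ([A-Z][a-z]{2})-(\d+)")


def extract_epitopes(sequence, ranges_text):
    """Return the sub-sequences named by ranges such as 'Lys-2 to Tyr-5'."""
    epitopes = []
    for match in RANGE_RE.finditer(ranges_text):
        start_res, start, end_res, end = (
            match.group(1), int(match.group(2)),
            match.group(3), int(match.group(4)),
        )
        segment = sequence[start - 1:end]  # positions are 1-based, inclusive
        if start < 1 or len(segment) != end - start + 1:
            raise ValueError(f"range {match.group(0)!r} is outside the sequence")
        if (segment[0] != THREE_TO_ONE[start_res]
                or segment[-1] != THREE_TO_ONE[end_res]):
            raise ValueError(f"residues do not match for {match.group(0)!r}")
        epitopes.append(segment)
    return epitopes


if __name__ == "__main__":
    toy_sequence = "MKTAYIAKQRQISFVK"  # hypothetical, for illustration only
    print(extract_epitopes(toy_sequence, "Lys-2 to Tyr-5, Gln-9 to Ser-13"))
```

The residue check is useful with scanned text, where a misread position number would otherwise silently select the wrong segment.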

The tissue distribution within various endocrine and immunological tissues
combined with the sequence homology to a conserved collagen motif indicates that
polynucleotides and polypeptides corresponding to this gene are useful for the
diagnosis of various autoimmune disorders including, but not limited to, rheumatoid
arthritis, lupus erythematosus, scleroderma, and dermatomyositis. Because the gene is
expressed in cells of lymphoid origin, the natural gene product may be involved in
immune functions. Therefore it may also be used as an agent for immunological
disorders including arthritis, asthma, immune deficiency diseases such as AIDS, and
leukemia.



FEATURES OF PROTEIN ENCODED BY GENE NO: 161 

This gene has homology to the tissue inhibitor of metalloproteinase 2. Such 
inhibitors are vital to proper regulation of metalloproteinases such as collagenases (See
Accession No. P16368). In addition, this gene maps to chromosome 17, and
therefore, may be used as a marker in linkage analysis for chromosome 17 (See
Accession No. P16368).

This gene is expressed primarily in several types of cancer including 
osteoclastoma, chondrosarcoma, and rhabdomyosarcoma and to a lesser extent in 
several non-malignant tissues including synovium, amygdala, testes, and placenta.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, various types of cancer, particularly cancers of bone and cartilage, as 
well as various autoimmune disorders. Similarly, polypeptides and antibodies directed 
to these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the musculoskeletal system, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution in various cancers and the sequence homology to a 
collagenase inhibitor indicates that polynucleotides and polypeptides corresponding to 
this gene are useful for detection of various autoimmune disorders such as rheumatoid 
arthritis, lupus, scleroderma, and dermatomyositis. Therefore it may be also used as an 
agent for immunological disorders including arthritis, asthma, immune deficiency
diseases such as AIDS, and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 162 

This gene is homologous to the mitochondrial ATP6 gene and therefore is likely 
a homolog of this gene family (See Accession No. X76197). 
This gene is expressed primarily in brain tissue. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, a variety of brain disorders, including Down's syndrome, depression,
schizophrenia, and epilepsy. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the central nervous system, expression of this gene at significantly higher
or lower levels may be routinely detected in certain tissues (e.g., cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal 
fluid) or another tissue or cell sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy tissue 
or bodily fluid from an individual not having the disorder. 

The tissue distribution in brain tissue indicates this gene is useful for diagnosis
of various neurological disorders [remainder of sentence illegible in source].

FEATURES OF PROTEIN ENCODED BY GENE NO: 163

This gene is expressed primarily in placenta, neutrophils, and microvascular 
endothelial cells and to a lesser extent in multiple tissues including brain, prostate, 
spleen, thymus, and bone. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neutropenia and other diseases of the immune system. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune system, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution in placenta indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for diagnosis of various female
reproductive disorders. Additionally the gene product may be used as a target in the 
immunotherapy of various cancers. Because the gene is expressed in some cells of 
lymphoid and endocrine origin, the natural gene product may be involved in immune 
functions and metabolism regulation, respectively. Therefore it may be also used as an 
agent for immunological disorders including arthritis, asthma, immune deficiency 
diseases such as AIDS, and leukemia. 



FEATURES OF PROTEIN ENCODED BY GENE NO: 164 

This gene is expressed primarily in neutrophils, monocytes, bone marrow, and 
fetal liver.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, immune system disorders including, but not limited to, autoimmune 
disorders such as lupus, and immunodeficiency disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the immune system, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution in various immune system tissues indicates that
polynucleotides and polypeptides corresponding to this gene are useful for the 
diagnosis of various immunological disorders such as Hodgkin's lymphoma, arthritis, 
asthma, immune deficiency diseases such as AIDS, and leukemia. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 165 

The translation product of this gene shares sequence homology with dystrophin
which is thought to be defective in both Duchenne and Becker Muscular Dystrophy.
Preferred polypeptide fragments comprise the following amino acid sequence:

MKLLGECSSSIDSVKRLEHKLKEEEESLPGFVNLHSTETQTAGVIDRWELLQAQ 

ALSKELRMKQNLQKWQQFNSDLNSIWAWLGDTEEELEQLQRLELSTDIQTIELQ 

IKKLKELQICAVDHRKAIILSINLCSPEFTQADSKESRDLQDRLXQMNGRWDRV 

CSLLEEWRGLLQDALMQCQGFHEMSHGLLEMLENIDRRKNEIVP1DSNLDAEII. 

QDHHKQLMQIKHELLESQLRVASLQDMSCQLLVNAEGTDCLEAKEKVHV1GNR 

LKLLLKEVSRHIKELEKLLDVSSSQQDI.SSWSSADELDTSGSVSPXSGRSTPNR 

QKTPRGKCSLSQPGPS VSSPHSRSTKGGSDSSLSEPXPGRSGRGFLFRVERAA 

LPLQLLLLLLIGLACLVPMSEEDYSCALSNNFARSFHPMLRYTNGPPPL (SEQ ID 

NO:726); MKLLGECSSSIDSVKRLEHKLKEEEESLPGFVNLHSTETQTAGVIDR 

WELLQAQALSKELRMKQNLQKWQQFNSDLNSIWAWLGDTEEELEQLQRLELS 

TDIQTIELQIK (SEQ ID NO:727); KLKELQKA V DHRKAIILS INLCSPEFTQ ADS K 

ESRDLQDRLXQMNGRWDRVCSLLEEWRGLLQDALMQCQGFHFA1SHGLI,LML 

HNIDRRKNEIVPIDSNLDAEILQDHHKQLMQIKUELLESQLRVASLQDMSCOF 
(SEQ ID NO:728); QDMSCQLLVNAEGTDCLEAKEKVH VIGNRLKLLLKEVS 

RHIKELEKLLDVSSSQQDLSSWSSADELDTSGSVSPXSGRSTPNRQK I PRGKCS 
LSQPGPSVSSPIIS (SEQ ID NO:729); DSSLSEPXPGRSGRGFLFRVLRAAL 

PLQLLLLLLIGLACLVPMSEEDYSCALSNNFARSFHPMLRYTNGPPPL (SEQ ID 
NO:730). Also preferred are polynucleotide fragments encoding these polypeptide
fragments.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, musculoskeletal disorders including Muscular Dystrophy and 
cardiovascular diseases. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the muscle tissues, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution and homology to dystrophin indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for the 
diagnosis and treatment of Muscular Dystrophy and other muscle disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 166 

This gene is expressed primarily in human cerebellum. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases of the central nervous system, including Alzheimer's Disease, 
Parkinson's Disease, ALS, and mental illnesses. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of 
the above tissues or cells, particularly of the central nervous system, expression of this 
gene at significantly higher or lower levels may be routinely detected in certain tissues 
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, 
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual 
having such a disorder, relative to the standard gene expression level, i.e., the 
expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 399 as residues: Pro-20 to Gly-26, Leu-37 to Pro-42, His-57 to Gly-63. 

The tissue distribution indicates that the protein products of this gene are useful 
for treatment/diagnosis of diseases of the central nervous system and may protect or 
enhance survival of neuronal cells by slowing progression of neurodegenerative
diseases. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 167 

Preferred polypeptides encoded by this gene comprise the following amino acid
sequence:

MKLLICGNYLAPSHSESSRRCCLLCFYPLCLEINFC1MKVFLSMPFLVLFQ 
SLIQED (SEQ ID NO:731). Polynucleotides encoding such polypeptides are also 
provided. This gene is believed to reside on chromosome 15. Therefore polynucleotides
derived from this gene are useful in linkage analysis as chromosome 15 markers.

This gene is expressed primarily in human testes tumor and to a lesser extent in 
normal human testes. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases of the testes, particularly cancer, and other reproductive
disorders. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the male reproductive tissues, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that the protein products of this gene are useful 
for treatment/diagnosis of testicular diseases including cancers. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 168 

This gene is expressed primarily in fetal liver.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
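not limited to, conditions affecting [text illegible in source]. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the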



WO 98/54963 



PCT/US98/11422 



132 



hepatic system, and fetal hematopoietic system, expression of this gene at significantly 
higher or lower levels may be routinely detected in certain tissues (e.g., cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal 
fluid) or another tissue or cell sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder. Preferred epitopes include
those comprising a sequence shown in SEQ ID NO: 401 as residues: His-7 to Trp-17,
Leu-19 to Lys-27, Pro-33 to Gly-44, Lys-68 to Gly-74, Lys-85 to Cys-95.

The tissue distribution indicates that the protein products of this gene are useful 
for treatment/diagnosis of diseases of the developing liver and hematopoietic system,
and act as a growth differentiation factor for hematopoietic stem cells. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 169 

The polypeptide encoded by this gene is believed to be a membrane-bound
receptor, the extracellular domain of which is expected to consist of the following
amino acid sequence:

RILLVKYSANEENKYDYLPTrVNVCSELVKLVFCVLVSFCVIKKDHQSRNLKY 

ASWKEFSDFMKWSIPAFLYFLDNLIVFYVLSYLQPAMAVIFSNFSIITFALLFRIV 

LKXRLNWIQWASLLTLFLSIVALTAGTKTLQHNLAGRGFHHDAFFSPSNSCLL 

FRNECPRKDNCTAKEWTFPEAKWNTTARVFSHIRLGMGHVLIIVQCFISSMANI
YNEKILKEGNQLTEXIFIQNSKLYFFGILFNGLTLGLQRSNRDQIKNCGFFYGH 
S (SEQ ID NO:732). Thus, preferred polypeptides encoded by this gene comprise the 
extracellular domain as shown above. It will be recognized, however, that deletions of
either end of the extracellular domain up to the first cysteine from the N-terminus and
the first cysteine of the C-terminus are expected to retain the biological functions of the
full-length extracellular domain because the cysteines are thought to be responsible for
providing secondary structure to the molecule. Thus, deletions of one or more amino
acids from either end (or both ends) of the extracellular domain are contemplated. Of
course, further deletions including the cysteines are also contemplated as useful, as such
polypeptides are expected to have immunological properties such as the ability to evoke
an immune response. Polynucleotides encoding all of the foregoing polypeptides are
provided.
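
The largest terminal deletion described above, trimming each end up to the outermost cysteine, can be sketched as follows. This is an illustrative reading only: the function name `cysteine_bounded_core` and the toy sequence are assumptions, and the sketch assumes the bounding cysteines themselves are retained.

```python
def cysteine_bounded_core(domain):
    """Return the region from the first cysteine (N-terminal side)
    to the last cysteine (C-terminal side), inclusive."""
    first = domain.find("C")
    last = domain.rfind("C")
    if first == -1:
        raise ValueError("domain contains no cysteine residues")
    return domain[first:last + 1]


def terminal_deletions(domain):
    """Return how many residues the maximal deletion removes from each end."""
    core = cysteine_bounded_core(domain)
    n_removed = domain.find("C")
    c_removed = len(domain) - n_removed - len(core)
    return n_removed, c_removed


if __name__ == "__main__":
    toy_domain = "MKLCAGTCWYCRK"  # hypothetical, for illustration only
    print(cysteine_bounded_core(toy_domain))
    print(terminal_deletions(toy_domain))
```

Any shorter deletion from either end, down to zero residues, falls between the full-length domain and this core.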

This gene is expressed primarily in human osteoclastoma and to a lesser extent 
in hippocampus and chondrosarcoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancers, particularly those of the bone and connective tissues. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the skeletal system, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 402 as residues: Met-1 to Cys-6, Ala-41 to Tyr-49, Lys-76 to Lys-84. 

The tissue distribution indicates that the protein products of this gene are useful 
for diagnosis of cancers of the bone and connective tissues, and may act as growth 
factors for cells involved in bone or connective tissue growth. 



FEATURES OF PROTEIN ENCODED BY GENE NO: 170 

Preferred polypeptides encoded by this gene comprise the following amino
acid sequence: 

NSVPNLQTLAVLTEAIGPEPAIPRXPREPPVATSTPATPSAGPQPLPTGTV 
LVPGGPAPPCLGEAWALLLPPCRPSLTSCFWSPRPSPWKETGV (SEQ ID
NO:733). Polynucleotides encoding such polypeptides are also provided herein. 
This gene is expressed primarily in hematopoietic progenitor cells. 
Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
25 biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, d iseases of the blood including cancer and autoimmune disorders 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
blood/circulatory system, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or 
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another 
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that the protein products of this gene are useful
for diagnosis of diseases involving growth differentiation of hematopoietic cells.

FEATURES OF PROTEIN ENCODED BY GENE NO: 171 

Preferred polypeptides encoded by this gene comprise the following amino acid 
sequences: ALQLAFYPDAVEEWLEENVHPSLQRLQXLLQDLSEVSAPP (SEQ ID 
NO:734); and/or CHPPALAGTLLRTPEGRAHARGLLLEAGGA (SEQ ID NO:735). 
Polynucleotides encoding such polypeptides are also provided. The protein product of 
this gene shares sequence homology with metallothioneins. Thus, polypeptides encoded
by this gene are expected to have metallothionein activity; such activities are known in
the art and described elsewhere herein.

This gene is expressed primarily in kidney cortex. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases of the kidney including cancer and renal dysfunction. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the renal system, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 404 as residues: Ser-47 to Gln-52. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment/diagnosis of diseases of the kidney 
including kidney failure. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 172 

This gene is expressed primarily in 12 week old early stage human. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, developmental abnormalities. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the developing embryo, expression of this 
gene at significantly higher or lower levels may be routinely detected in certain tissues 
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual 
having such a disorder, relative to the standard gene expression level, i.e., the 
expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 405 as residues: Gln-31 to Thr-43, Gly-51 to Ser-58, Pro-65 to Pro-72.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment/diagnosis of developmental 
problems with fetal tissue. The gene may be involved in vital organ development in the 
early stage, especially hematopoiesis, cardiovascular system, and neural development.

FEATURES OF PROTEIN ENCODED BY GENE NO: 173 

The translation product of this gene shares sequence homology with TGN38, an 
integral membrane protein previously shown to be predominantly localized to the trans- 
Golgi network (TGN) of cells. 

This gene is expressed primarily in developing embryo and to a lesser extent in 
cancer tissues including lymphoma, endometrial, prostate and colon.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, developmental abnormalities and cancer. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the developing fetus, expression of this 
gene at significantly higher or lower levels may be routinely detected in certain tissues 
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual 
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: [number and residue ranges illegible in source].
diagnosis of cancers and developmental abnormalities where aberrant expression relates
to an abnormality. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 174 

The translation product of this gene shares sequence homology with a dnaJ heat 
shock protein from E. coli which is allelic to sec63, a gene that affects transit of nascent 
secretory proteins across the endoplasmic reticulum in yeast. 

This gene is expressed primarily in Hodgkin's lymphoma and to a lesser extent
in testes. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancer. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune system, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those 
comprising a sequence shown in SEQ ID NO: 407 as residues: Thr-13 to Trp-21, Arg-74
to Asp-81.

The tissue distribution and homology to dnaJ indicate that polynucleotides and
polypeptides corresponding to this gene are useful as a diagnostic for cancer including 
Hodgkin's lymphoma. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 175 

This gene is expressed primarily in endothelial cells and to a lesser extent in 
bone marrow stromal cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases involving angiogenic abnormalities including diabetic 
retinopathy, macular degeneration, and other diseases including arteriosclerosis and 
cancer. Similarly, polypeptides and antibodies directed to these polypeptides are useful 
in providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the
vascular system, expression of this gene at significantly higher or lower levels may be 
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 
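
The comparison described above, expression in a test sample relative to the standard expression level in healthy tissue or bodily fluid, can be illustrated with a simple ratio test. This is only a minimal sketch: the function name, the two-fold threshold, and the numeric levels are assumptions made for illustration and are not taken from this specification, which does not define what counts as "significantly" higher or lower.

```python
def classify_expression(sample_level, standard_level, fold_threshold=2.0):
    """Compare a sample's expression level with the standard (healthy) level.

    fold_threshold is a hypothetical cutoff chosen purely for illustration.
    """
    if sample_level < 0 or standard_level <= 0:
        raise ValueError("sample must be non-negative and standard must be positive")
    if fold_threshold <= 1.0:
        raise ValueError("fold_threshold must be greater than 1")

    # Ratio of the test sample to the standard expression level.
    ratio = sample_level / standard_level
    if ratio >= fold_threshold:
        return "higher"
    if ratio <= 1.0 / fold_threshold:
        return "lower"
    return "unchanged"


if __name__ == "__main__":
    # Hypothetical measurements in arbitrary units.
    print(classify_expression(10.0, 4.0))
    print(classify_expression(1.0, 4.0))
    print(classify_expression(5.0, 4.0))
```

In practice the cutoff would be set by the assay and its statistics; the fixed fold change here is just a stand-in for that decision.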

The tissue distribution indicates that the protein products of this gene are useful 
for treating diseases where an increase or decrease in angiogenesis is indicated and as a 
factor in the wound healing process. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 176 

The translation product of this gene shares sequence homology with MAT-8
(mouse) which is thought to be important in regulating chloride conductance in cells
(particularly in the breast) by modulating the response mediated by cAMP and protein 
kinase C to extracellular signals. 

This gene is expressed primarily in amniotic cells and hematopoietic cells
including macrophages, neutrophils, T cells, TNF induced aortic endothelium and to a
lesser extent in testes, TNF induced epithelial cells, and smooth muscle.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, inflammatory responses mediated by T cells, macrophages, and/or
neutrophils particularly those involving TNF, and also cancer. Similarly, polypeptides 
and antibodies directed to these polypeptides are useful in providing immunological 
probes for differential identification of the tissue(s) or cell type(s). For a number of 
disorders of the above tissues or cells, particularly of the immune system, expression of 
this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
responses to cytokines such as TNF and thus modifying the duration and/or severity of
inflammation. Polynucleotides and polypeptides derived from this gene are thought to 
be useful in the diagnosis and treatment of cancer. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 177

This gene is expressed primarily in endothelial cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, vascular restenosis. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the vascular system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treating diseases associated with vascular
response to injury such as vascular restenosis following angioplasty.

FEATURES OF PROTEIN ENCODED BY GENE NO: 178 

One embodiment of the claimed invention comprises: 

MRPDWKAGAGPGGPPQKPAPSSQRKPPARPSAAAAAIAVAAAEEERRLRQRN
RLRLEEDKPAVERCLEELVFGDVENDEDALLRRLRGPRVQEHEDSGDSEVENEA 
KGNFPPQKKPVWVDEEDEDEEMVDMMNNRFRKDMMKNASESKESKDNLKK 
RLKEEFQHAMGGVPAWAETTKRKTSSDDESEEDEDDLLQRTGNHSTSTSLPRG 
ILKMKNCQHANAERPTVARISICAVPSRCTDCDGCWD (SEQ ID NO:737); or 

CLEELVFGDVENDEDALLRRLRGPRVQEHEDSGDSEVENEAKGNFPPQKKPV
WVDEEDEDEEMVDMMNNRFRKI)MMK^ 

GVPAWAETTKRKTSSDDESEEDEDDLLQRTGNFISTSTSLPRGILKMKNCQHA 
NAERPTVARISICAVPSRCTDCDGC (SEQ ID NO: 738). LKEKIVRSFEVSPDGS 
FLLLNGIAGYLHLLAMKTKELIGSMKINGRVAASTFSSDSKKVYASSGDGEVYV 
WDVNSRKCLNRFVDEGSLYGLSIATSRNGQYVACGSNCGVVNIYNQDSCLQE
TNPh^IKAlMNLVTGVTSLTFNPTTEILAIASEKMKEAVRTVHLPSCTVFSNFPVI 
KNKNISHVHTMDFSPRSGYFALGNEKGKALMYRLHHYSDF (SEQ ID NO:739); 
and/or KINGRVAASTFSSDSKKVYASSGDGEVYVWDVNSRKCLNRFVDEGSL
YGLSIATSRNGQYVACGSNCGVVNIYNQDSCLQHTNPKPIKAIMNLVTGVTSLT 
FNTTTEILAIASHKMKtiAVRLVHLPSCTVFSNFPVIKNKNISHVHTMDrSPRSG 
YFALGNEKGKAL (SEQ ID NO: 740). 
This gene is expressed primarily in epididymis and endometrial tumors and to a
lesser extent in T cell lymphoma and cell lines derived from colon cancer.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, tumors of the reproductive organs including testis and endometrial cells.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
reproductive system, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 411 as residues: Ser-67 to Lys-72, Val-87 to Leu-93,
Tyr-128 to Pro-141, Asp-204 to Gly-210.
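
Epitope residue ranges of the form "Ser-67 to Lys-72" can be mapped onto a protein sequence mechanically. The sketch below is illustrative only: the helper names are hypothetical, and the short sequence in the usage example is a made-up toy sequence, not any SEQ ID NO of this specification. It parses the range notation, slices a 1-indexed sequence, and checks that the residues at both ends match the three-letter codes given.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches ranges such as "Ser-67 to Lys-72".
RANGE_RE = re.compile(r"([A-Z][a-z]{2})-(\d+) to ([A-Z][a-z]{2})-(\d+)")


def parse_epitope_ranges(text):
    """Return (start_residue, start_pos, end_residue, end_pos) tuples."""
    return [(a, int(i), b, int(j)) for a, i, b, j in RANGE_RE.findall(text)]


def extract_epitopes(sequence, text):
    """Slice each residue range out of a 1-indexed one-letter sequence."""
    fragments = []
    for a, i, b, j in parse_epitope_ranges(text):
        if not 1 <= i <= j <= len(sequence):
            raise ValueError(f"range {a}-{i} to {b}-{j} is outside the sequence")
        fragment = sequence[i - 1:j]
        # Confirm that the end residues agree with the stated three-letter codes.
        if fragment[0] != THREE_TO_ONE[a] or fragment[-1] != THREE_TO_ONE[b]:
            raise ValueError(f"residues at {i} or {j} do not match {a}/{b}")
        fragments.append(fragment)
    return fragments


if __name__ == "__main__":
    toy_sequence = "MKTSAYLCQG"  # hypothetical sequence for illustration
    print(extract_epitopes(toy_sequence, "Met-1 to Ser-4, Tyr-6 to Gly-10"))
```

The end-residue check is a cheap way to catch a range that has been paired with the wrong sequence or offset.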

The tissue distribution indicates that the protein products of this gene are useful 
for treating tumors of the endometrium or epithelial tumors of the reproductive system.

FEATURES OF PROTEIN ENCODED BY GENE NO: 179

Preferred polypeptides encoded by this gene comprise the following amino acid 
sequence: 

MRIEQLILLALATGEVGGETRIIKGFFX^KLHSQ 
WKLTAAHCLKPRYIVIILGQHNXQKEEG^ 
RND1MLV K^lASPVSITWAVRPL'rLSSRCVlVXGTSCSFPAGAARPDPSYACLl'PC
IMPTSPSLSTRSVRTPTPATSQTPWCVPACRKGARTPARVTPGALWSVTSLFKA 
LSPGARIRVRSPFSLVSTRKSANMWTGSRRR (SEQ ID NO:741); ETRIIKGFFC

KLHSQPWQAALFEKTRLLCTjATLIAPRWLl/rAAHC'LKF^RYIVHLGOHNLQKEF: 
GCFQTRTATFSFPHPGFNNSF P\f K OMR VHTM T Vk'M ^ rr^- , ^ t , P t > t - 
SRRR (SEQ ID NO:742); or CKLHSQPWQAALFEKTRLLCGATLIAPRWLLT
AAHCLKPRYIVHLGQHNLQKEEGCEQTRTATESFPHPGFNS 
(SEQ ID NO:743). The translation product of this gene shares sequence homology 
with neuropsin, a novel serine protease which is thought to be important in modulating
extracellular signaling pathways in the brain. Owing to the structural similarity to other 
serine proteases the protein products of this gene are expected to have serine protease 
activity which may be assayed by methods known in the art and described elsewhere 
herein. 

This gene is expressed primarily in endometrial tumor and to a lesser extent in 
colon cancer, benign hypertrophic prostate, and thymus. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, cancers of the endometrium or colon and benign hypertrophy of the
prostate. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the tissue(s) 
or cell type(s). For a number of disorders of the above tissues or cells, particularly of 
the urogenital or reproductive systems, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those 
comprising a sequence shown in SEQ ID NO: 412 as residues: Gly-12 to Ser-22, Pro-34
to Ser-53.

The tissue distribution and homology to serine proteases indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosing 
or treating hyperproliferative disorders such as cancer of the endometrium or colon and
hyperplasia of the prostate. 


FEATURES OF PROTEIN ENCODED BY GENE NO: 180 

Preferred polypeptides encoded by this gene comprise the following amino acid
sequence: VLQGRYFSPILEMRRLRPEGXXNLPGGSRAQKEPRQDLTLVLWPHC 
PHFAMTRSYVPTKQCMVQGSFYCIFIFKGPVQNWC (SEQ ID NO:744). 
Polynucleotides encoding such polypeptides are also provided.

This gene is expressed primarily in fetal brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, identifying and expanding stem cells in the CNS. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the nervous system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that the protein products of this gene are useful
for detecting and expanding stem cell populations in the (or of the) central nervous
system.

FEATURES OF PROTEIN ENCODED BY GENE NO: 181

This gene is expressed primarily in early stage human brain and a stromal cell
line.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental abnormalities of the CNS. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the central nervous system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 414 as residues: [residue ranges illegible in source].
are useful for diagnosing or treating developmental abnormalities of the central nervous
system. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 182

Preferred polypeptides encoded by this gene comprise the following amino acid 
sequence: 

MPHDQVNPELHDFMQSAEVGTIFALSWLITWFGHVLSDFRUVVRLYDF 
FLACHPLMPIYFAAVIVL^REQBVLDCDCDMASVHHLLSOIPQDLPYETLISRXF 
TFLFSFPHPNLLGRPLPNSKLRGRQPLLSKTLSWHQPSRGLIWCCGSGXRGLL 
RPEDRTKDVLTKPRTNRFVKLAVMGLTVAIX^AAALAVVKSALEWAPKFQLOL 
FP (SEQ ID NO:745); or CPEFFIPATLPCPFVFAFrSEASSRA YLTQRGPGGLAQ 
NLMPLPVGFWMCJSLPPPWCWRKWVSEACSCFC (SEQ ID NO:746). These
polypeptides are structurally similar to various TGF-beta family members. Thus, this 
polypeptide is expected to have a variety of activities in the modulation of cell growth 
and proliferation. 

This gene is expressed primarily in osteoclastoma, microvascular endothelium, 
and bone marrow derived cell lines. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, hematological diseases particularly involving aberrant proliferation of 
stem cells. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the tissue(s) 
or cell type(s). For a number of disorders of the above tissues or cells, particularly of 
the immune system, expression of this gene at significantly higher or lower levels may 
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. Preferred epitopes include those comprising a 
sequence shown in SEQ ID NO: 415 as residues: Ser-33 to Ala-39. 

The tissue distribution indicates that the protein products of this gene are useful
for treating disorders of the progenitors of the immune system. Applications include in
vivo expansion of progenitor cells, ex vivo expansion of progenitor cells, or the 
treatment of tumors of the circulatory system, such as lymphomas. 
FEATURES OF PROTEIN ENCODED BY GENE NO: 183

This gene maps to chromosome 17 and therefore, polynucleotides of the 
invention can be used in linkage analysis as a marker for chromosome 17. In specific 
embodiments, polypeptides of the invention comprise the sequence: 
GFGSVSAAGRRSGGTWQPVQ (SEQ ID NO:747); PGGLAVGSRWWSRSLT
(SEQ ID NO:748); LEPSRQRRPRRRGGTSRPETDQRAKCWRQL (SEQ ID 
NO:749); and/or VCLRCQNRMEN (SEQ ID NO:750). In further specific 
embodiments, polypeptides of the invention comprise the sequence: MAACTARRPGR 
GQPLVVPVADXGPVAKAALCAAXAGAFSPASTTTTRRHLSSRNRPEGKVLETV
GVFEVPKQNGKYETGQLFLHSIFGYRGVVLFPWQARLXDRDVASAAPEKAEN
PAGUGSKEVKGKTHTYYQVLIDARDCPHISQRSQTEAVTFLANHDDSRALYAIP 
GLDYVSHEDILPYTSTDQVPIQHELFERFLLYDQTKAPPFVARETLRAWQEKNH 
PWLELSDVHRETrENlRvTVIPFYMGMREAQNSHVYWWRYCIREENEDSDVVQ 
LRERHWRIFSESGTLETVRGRGVVGREPVLSKEQPAFQYSSHVSLQASSGHMW
GTFRFERPDGSHFDVRIPPFSLESNKDEKTPPSGLHW (SEQ ID NO:751);
MAACTARRPGRGQPLVVPVADXGPVAKAAECAA (SEQ ID NO:752); 
VLETVGVFEVPKQNGKYETGQLFLHSIFGYRGVVL (SEQ ID NO:757); 
GLDYVSHEDILPYTST (SEQ ID NO:758); DVHRETTENIRVTVIPFYM (SEQ ID 
NO:759); WWRYCIRLENLDSDVVQLRER (SEQ ID NO:760); and/or PAFQYSS
HVSLQASSGHMWGTFRFER (SEQ ID NO:761). Polynucleotides encoding these
polypeptides are also encompassed by the invention. 

This gene is expressed primarily in gall bladder, prostate, and fetal brain, and to 
a lesser extent in a few tumor and fetal tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, growth related disorders such as cancers. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the prostate, gall bladder, and fetal brain,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
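an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of growth-related
disorders, such as cancers.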

FEATURES OF PROTEIN ENCODED BY GENE NO: 184

In specific embodiments, polypeptides of the invention comprise the 
sequence: SLCCPEGAEGC (SEQ ID NO:762) and/or QLKKTHYDRPCP (SEQ ID
NO:763). Polynucleotides encoding these polypeptides are also encompassed by the 
invention. 

This gene is expressed primarily in stromal cell, tonsil and glioblastoma and to
a lesser extent in some other tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune and inflammatory disorders and glioblastoma. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the stromal cells,
tonsil, and glioblastoma, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Additionally, it is believed that the
product of this gene regulates pancreatic cell differentiation into beta cells. Accordingly,
polynucleotides and polypeptides of the invention are useful in the treatment of insulin-
dependent diabetes mellitus and associated conditions, e.g., pancreatic hypofunction, and
in the prevention, as well as the treatment, of undifferentiated type pancreatic cancers.
Preferred epitopes include those comprising a sequence shown in SEQ ID NO: 417 as
residues: Pro-27 to Ala-32.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of immune and 
inflammatory disorders and glioblastoma. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 185

This gene is expressed primarily in hepatocellular carcinoma and to a lesser 
extent in other tissues. 




Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, liver diseases. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the liver, expression of this gene at significantly higher or lower levels 
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or 
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another 
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those 
comprising a sequence shown in SEQ ID NO: 418 as residues: Gly-32 to Lys-39.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of liver diseases. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 186 

This gene is expressed primarily in hippocampus and to a lesser extent in other
tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neuronal disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the hippocampus, expression of this gene at significantly 
higher or lower levels may be routinely detected in certain tissues (e.g., cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal 
fluid) or another tissue or cell sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy tissue 
or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of neuronal disorders. 

lesser extent in osteoclastoma and other tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, bone-related disorders and neuronal diseases. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the bone, osteoclast, and
hippocampus, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of bone-related
disorders and neuronal diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 188 

This gene maps to chromosome 4 and therefore polynucleotides of the invention 
can be used in linkage analysis as a marker for chromosome 4.

This gene is expressed primarily in neuronal tissues such as hippocampus, 
spinal cord, and hypothalamus and to a lesser extent in a few other tissues such as 
ovary. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neuronal diseases. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the neuronal tissues, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of neuronal disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 189

This gene maps to chromosome 10; therefore, polynucleotides of the invention
can be used in linkage analysis as a marker for chromosome 10.

This gene is expressed primarily in neuronal tissues and immune tissues, and to 
a lesser extent in a few other tissues such as skin tumor and lung.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neuronal and immune-related disorders. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the neuronal and immune-related tissues,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 422 as residues: Pro-19 to Asp-25.
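
A residue range of this form (three-letter residue name plus 1-based position, such as "Pro-19 to Asp-25") can be turned into the corresponding peptide fragment mechanically. The following is a minimal sketch, not part of the specification: the function `epitope_from_range` is a hypothetical helper, and `EXAMPLE_SEQUENCE` is an invented placeholder, since the actual sequence of SEQ ID NO: 422 is not reproduced here.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

RANGE_PATTERN = re.compile(
    r"\s*([A-Z][a-z]{2})-(\d+)\s+to\s+([A-Z][a-z]{2})-(\d+)\s*"
)


def epitope_from_range(sequence, spec):
    """Return the fragment named by a spec such as 'Pro-19 to Asp-25'.

    Positions are 1-based and inclusive. The residue names in the spec are
    checked against the sequence so that a mismatched sequence is rejected.
    """
    match = RANGE_PATTERN.fullmatch(spec)
    if match is None:
        raise ValueError(f"unrecognized residue range: {spec!r}")

    start_name, end_name = match.group(1), match.group(3)
    start, end = int(match.group(2)), int(match.group(4))
    if start_name not in THREE_TO_ONE or end_name not in THREE_TO_ONE:
        raise ValueError(f"unknown residue name in: {spec!r}")
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("residue positions fall outside the sequence")

    fragment = sequence[start - 1:end]  # convert 1-based inclusive to a slice
    if fragment[0] != THREE_TO_ONE[start_name]:
        raise ValueError(f"position {start} is not {start_name}")
    if fragment[-1] != THREE_TO_ONE[end_name]:
        raise ValueError(f"position {end} is not {end_name}")
    return fragment


# Invented placeholder sequence (NOT SEQ ID NO: 422), built so that
# position 19 is Pro and position 25 is Asp.
EXAMPLE_SEQUENCE = "MKTLLVAGSCLLAWQRNEPGSTVYDHIK"

if __name__ == "__main__":
    print(epitope_from_range(EXAMPLE_SEQUENCE, "Pro-19 to Asp-25"))
```

Checking the terminal residues against the stated names is a design choice that guards against applying a range to the wrong sequence or to a sequence with shifted numbering.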

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of neuronal and 
immune-related disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 190 

The translation product of this gene shares sequence homology with human 
N33, a gene located in a homozygously deleted region of human metastatic prostate 
cancer which is thought to be important in prevention of prostate cancer. In specific 
embodiments, polypeptides of the invention comprise the sequence: 
AQRKKEMVLSEKVSQLMFW1NKRPVIRMNGDKFRRLVKAPPRNYSV1VMFTA 
LQLHRQCVVCKQADEEFQILANSWRYSSAFrNRIFF'AMVDFDEGSDVFQMLNM 
NSAPTFINFPAKGKPKRGDTYELQVRGFSAEQIARWIADRTDVNIRVIRPPNMA 
ARWRFWCVSVT (SEQ ID NO:765); MVVALLIVCDVPSAS (SEQ ID NO:766); 
AORK KFMYI SFKVSfM ( SFO JH ND-^ 1 \- MFYVTV k P PVH> \ tvr ; t u* r 

' ^ lvi[ [ * 1 >t n .s< > \) > i '! i >}.(. ,>| )\ i ( )\\[ .\\t\s,\PI H\f.f>.\k 

GKP (SEQ ID NO:770); KRGDTYELQVRGFSAEQIARWIADRTDVNIRVIRPPN
(SEQ ID NO:771); and/or YAGPLMLGLLLAVIGGLVYLRJ^VIWNFSLIKLDGLLOI
CVLCLL (SEQ ID NO:772). Polynucleotides encoding these polypeptides are also 
encompassed by the invention. 

This gene is expressed primarily in infant adrenal gland and a prostate cell line, and to
a lesser extent in a few other tissues such as liver and smooth muscle.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, prostate cancer and endocrine disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the prostate and adrenal gland, expression
of this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 423 as residues: Pro-34 to Gly-43, Arg-113 to Pro-120.

The tissue distribution and homology to N33 indicate that polynucleotides and
polypeptides corresponding to this gene are useful for diagnosis and treatment of
prostate cancer and endocrine disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 191 

This gene is expressed primarily in T cells and to a lesser extent in fetal lung.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, immune disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at significantly
higher or lower levels may be routinely detected in certain tissues (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder. Preferred epitopes include
those comprising a sequence shown in SEQ ID NO: 424 as residues: Trp-3 to Phe- ( >. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of immune disorders. 


FEATURES OF PROTEIN ENCODED BY GENE NO: 192 

This gene maps to chromosome 6; therefore, polynucleotides of the invention
can be used in linkage analysis as a marker for chromosome 6. Neural activity and
neurotrophins induce synaptic remodeling in part by altering gene expression. This
gene is believed to be a glycosylphosphatidylinositol-anchored protein encoded by a
hippocampal gene and to possess neural activity. This molecule is believed to be
expressed in postmitotic-differentiating neurons of the developing nervous system and
neuronal structures associated with plasticity in the adult. Message of this gene is
believed to be induced by neuronal activity and by the activity-regulated neurotrophins
BDNF and NT-3. The product of this gene is believed to stimulate neurite outgrowth
and arborization in primary embryonic hippocampal and cortical cultures and to act as a
downstream effector of activity-induced neurite outgrowth. In specific embodiments,
polypeptides of the invention comprise the sequence: DAVFKGFSDCLLKLGDS (SEQ
ID NO:773); CQHGAKDMWDKLRKFSKNLN (SEQ ID NO:774);
VLLVSLSAALATWLSF (SEQ ID NO:775); MGLKLNGRYISLILAVQIAYLVQAVR
AAGKCDAVFKGFSDCLLKLGDS (SEQ ID NO:776); PAAWDDKTNIKTVCTYW
EDFHSCTVTALTDCQEGAKDMWDKLRKESKNLNIQGSLFELCGSGNGAAGSL
LPAFPVLLVSLSAALATWLSF (SEQ ID NO:777); and/or MGLKLNGRYISLILA
VQIAYLVQAVRAAGKCDAVFKGFSDCLLKLGDSXXXXXPAAWDDKTNIKTVC
TYWEDFHSCTVTALTDCQEGAKDMWDKLRKESKNLNIQGSLFELCGSGNGAA
GSLLPAFPVLLVSLSAALATWLSF (SEQ ID NO:778). Polynucleotides encoding
these polypeptides are also encompassed by the invention.

This gene is expressed primarily in human placenta, endometrial tumor and 
tissues of the central nervous system (CNS). 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, reproductive disorders, cancers, and neurological diseases.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly
reproductive and neurological disorders, expression of this gene at significantly higher
or lower levels may be routinely detected in certain tissues (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder. Preferred epitopes include
those comprising a sequence shown in SEQ ID NO: 425 as residues: Asp-47 to Asp-63,
His-75 to Tyr-80, Pro-83 to Tyr-89.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of reproductive 
disorders such as endometrial tumors. Expression of this gene in tissues of the CNS 
and its strong homology to Neuritin suggest that the protein product from this gene may 
also be used in the treatment and diagnosis of neurological disorders and in the 
regeneration of neural tissues, e.g., following injury. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 193

The translation product of this gene shares sequence homology with tenascin,
which is thought to be important in development. The translation product of this gene is
believed to be a ligand of the fibroblast growth factor family. FGF ligand activity is
known in the art and can be assayed by methods known in the art and disclosed
elsewhere herein.

This gene is expressed primarily in endometrial tumors and other types of
tumors.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancers. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the cancer tissues, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 426 as residues: Gly-29 to Glu-34,
Arg-71 to Arg-76, Thr-176 to Cys-182, Gly-184 to Glu-199.

The tissue distribution and homology to tenascin indicate that polynucleotides
and polypeptides corresponding to this gene are useful for diagnosis and treatment of
cancers.



FEATURES OF PROTEIN ENCODED BY GENE NO: 194 

In specific embodiments, polypeptides of the invention comprise the sequence: 
MNSAAGFSHLDRRERVLKLGESFEKQPRCASTLC (SEQ ID NO:779). 
Polynucleotides encoding these polypeptides are also encompassed by the invention. 

This gene is expressed primarily in fetal human lung and neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, lung development [...]. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the respiratory system, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution in fetal lung and neutrophils indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis 
and treatment of lung- and immunity-related diseases, for example, lung cancer; viral,
fungal, or bacterial infections (e.g., lesions caused by tuberculosis); inflammation (e.g.,
pneumonia); and metabolic lesions.

FEATURES OF PROTEIN ENCODED BY GENE NO: 195 

This gene is expressed primarily in breast lymph node. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the diagnosis and treatment of immune
disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 196

This gene maps to chromosome 5 and, accordingly, polynucleotides of the invention can
be used in linkage analysis as a marker for chromosome 5. The translation product of
this gene shares sequence homology with human M-phase phosphoprotein 4, which is
thought to be important in phosphorylation and signal transduction processes. In
specific embodiments, polypeptides of the invention comprise the sequence:
TIYPTEEELQAVQKIVSITERALKLVSD (SEQ ID NO:780); RALKGVLRV
GVLAKGLLLRGDRNVNLVLLC (SEQ ID NO:781); ALAALRHAKWFQARAN
GLQSCVIIIRLLRDLCQRVPTWS (SEQ ID NO:782); GDALRRVFECISSGIIL (SEQ
ID NO:783); LAFRQIHKVLGMDPLP (SEQ ID NO:784); and/or TIYPTEEELQAVQ
KIVSITERALKLVSDSLSEHEKNKNKEGDDKKEGGKDRALKGVLRVGVLAKG
LLLRGDRNVNLVLLCSEKPSKTLLSRIAENLPKQLAVISPEKYDIKCAVSEAAIIL
NSC:VEPKMQVTITLTSPIIREENMREGDVTSGMVKDPPDVLDRQKCLDALAALR
HAKWFQARANGLQSCVIIIRILRDLCQRVPTWSDFPSWAMELLVEKAISSASSP
QSPGDALRRVFECISSGIILKGSPGLLDPCEKDPFDTLATMTDQQREDITSSAQFA
LRLLAFRQIHKVLGMDPLPQMSQRFNIHNNRKRRRDSDGVDGFEAEGKKDKK
DYDNF (SEQ ID NO:785). Polynucleotides encoding these polypeptides are also
encompassed by the invention.

This gene is expressed primarily in human hippocampus and to a lesser extent
in prostate and human frontal cortex.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders related to reproductive system and nervous system. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the reproductive
system and nervous system, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution and homology to human M-phase phosphoprotein 4
indicate that polynucleotides and polypeptides corresponding to this gene are useful for
the diagnosis and treatment of reproductive and nervous system disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 197 

In specific embodiments, polypeptides of the invention comprise the sequence: 
MGSQHSAAARPSSCRRKQEDDRDG (SEQ ID NO:786); 

I I A PR FOPP A I A OPPV\/PCTr. D nc ttpi rr- /opa ma , 

QGTGYIPTEQVNELVALIPHSDQRLRPQRTKQYV (SEQ ID NO: 788). 
Polynucleotides encoding these polypeptides are also encompassed by the invention. 

This gene is expressed primarily in Human Primary Breast Cancer and to a 
lesser extent in Human Adult Spleen, Hodgkin's Lymphoma I, Salivary Gland. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, cancer and immune disorders. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of 
the above tissues or cells, particularly of the cancer and immune system, expression of 
this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 430 as residues: Ser-126 to Gly-138. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the diagnosis and treatment of cancer and
immune disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 198

This gene is expressed primarily in monocytes. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, blood cell disorders. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the immune system, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the diagnosis and treatment of blood cell 
disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 199 

This gene is expressed primarily in Human Ovary and Synovia and to a lesser 
extent in Human 8 Week Whole Embryo. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, reproductive and developmental disorders. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the reproductive and developmental system, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the diagnosis and treatment of reproductive 
and developmental disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 200

This gene maps to chromosome 8 and therefore polynucleotides of the invention 
can be used in linkage analysis as a marker for chromosome 8. The translation product 
of this gene shares limited sequence homology with the collagen proline-rich domain.

This gene is expressed primarily in CNS. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neurological diseases. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the nervous system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred 
epitopes include those comprising a sequence shown in SEQ ID NO: 433 as residues: 
Pro-35 to Asp-41. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of neurological 
diseases. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 201 

The translation product of this gene shares homology with a mammalian histone
H1a protein. One embodiment of this gene is a polypeptide fragment comprising one of the
following amino acid sequences: ARLNVGRESEKREMEKSQGVKVSESPMGAR
HSSWPEGAAPCKKVQGAQMQFPPRR (SEQ ID NO:789); ARENVGRESLKR
EML (SEQ ID NO:790); LKSQGVKVSESPMGARHSSW (SEQ ID NO:791);
AFCKKVQGAQMQFPPRR (SEQ ID NO:792). An additional embodiment is the
polynucleotide fragments encoding these polypeptide fragments (See Accession No.
pirIS2417S).

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of immune disorders.
Since the gene is expressed in cells of lymphoid origin, the natural gene product may be
involved in vital immune functions. Therefore it may be also used as an agent for
immunological disorders including arthritis, asthma, immune deficiency diseases such
as AIDS, and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 202 

This gene is expressed primarily in neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of immune disorders. 
Since the gene is expressed in cells of lymphoid origin, the natural gene product may be 
involved in immune functions. Therefore it may be also used as an agent for
immunological disorders including arthritis, asthma, immune deficiency diseases such
as AIDS, and leukemia.



FEATURES OF PROTEIN ENCODED BY GENE NO: 203

This gene is expressed primarily in Neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, infectious disorders, immune disorders, and cancers. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 436 as residues: Thr-31 to Lys-36.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of infectious 
disorders, immune disorders, and cancers. Since the gene is expressed in cells of 
lymphoid origin, the natural gene product may be involved in immune functions. 
Therefore it may be also used as an agent for immunological disorders including 
arthritis, asthma, immune deficiency diseases such as AIDS, and leukemia. The protein,
as well as antibodies directed against the protein, may show utility as a tumor marker
and/or immunotherapy target for the above-listed tumors and tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 204 

This gene maps to chromosome 16 and therefore polynucleotides of the 
invention can be used in linkage analysis as markers for chromosome 16. The 
translation product of this gene shares sequence homology with lactate dehydrogenase 
which is thought to be important in lactate metabolism. 

This gene is expressed primarily in human tonsils and to a lesser extent in
spleen and neutrophils.

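Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders, infectious disorders, and cancers. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For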
a number of disorders of the above tissues or cells, particularly of the immune 
disorders, infectious disorders, and cancers, expression of this gene at significantly 
higher or lower levels may be routinely detected in certain tissues (e.g., cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal 
fluid) or another tissue or cell sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy tissue 
or bodily fluid from an individual not having the disorder. Preferred epitopes include 
those comprising a sequence shown in SEQ ID NO: 437 as residues: Gly-7 to Ser-12. 

The tissue distribution and homology to the lactate dehydrogenase gene indicate
that polynucleotides and polypeptides corresponding to this gene are useful for 
diagnosis and treatment of immune disorders, infectious disorders, and cancers. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 205 

The translation product of this gene shares sequence homology with Gcapl 
protein which is developmentally regulated in brain. 

This gene is expressed primarily in placenta and endometrial tumor and to a 
lesser extent in several other tumors. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, vasculogenesis/angiogenesis and tumorigenesis. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the vascular system and tumors, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution and homology to Gcapl protein indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis
and treatment of disorders or dysfunction of the vascular system or of tumorigenesis.

FEATURES OF PROTEIN ENCODED BY GENE NO: 206

In specific embodiments, polypeptides of the invention comprise the sequence
MPYAQWLAENDRFEEAQKAFHKAGRQREA (SEQ ID NO:799);
VQVLEQLTNNAVAESRFNDAAYYYWMLSMQCEDIAQD (SEQ ID NO:794);
PAQKDTMLGKEYHEQRLAELYHGYHAIHRHTEDP (SEQ ID NO:795);
FSVHRPETLFNLSRFLLHSLPKDTPSGISKVKILFT (SEQ ID NO:800);
LAKQSKALGAYRLARHAYDKLRGLY1P (SEQ ID NO:796);
ARFQKSIELGTLTIRAKPFHDSEELVPLCYRCSTNN (SEQ ID NO:797); and/or
PLLNNLGNVCINCRQPFIFSASSYDVEHLVEFYLEEGITDEEAISLIDLEVLRPKRDDRQLEICKQQEPDSGG
(SEQ ID NO:798). Polynucleotides encoding these polypeptides are also
encompassed by the invention.

This gene is expressed primarily in testes. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, male reproductive and endocrine disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the reproductive and endocrine systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment of male reproductive and endocrine 
disorders. 



FEATURES OF PROTEIN ENCODED BY GENE NO: 207

This gene is expressed in fetal lung. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis [...]
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the respiratory system, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual 
having such a disorder, relative to the standard gene expression level, i.e., the 
expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 440 as residues: Tyr-49 to Cys-54.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for detection and treatment of disorders associated 
with developing lungs particularly in premature infants where the lungs are the last 
tissues to develop. The tissue distribution indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for diagnosis and intervention of 
lung tumors since the gene may be involved in the regulation of cell division, 
particularly since it is expressed in fetal tissue. Protein, as well as antibodies directed
against the protein, may show utility as a tumor marker and immunotherapy target for
the above listed tumors and tissues.

Table 1 summarizes the information corresponding to each "Gene No." described
above. The nucleotide sequence identified as "NT SEQ ID NO:X" was assembled from
partially homologous ("overlapping") sequences obtained from the "cDNA clone ID"
identified in Table 1 and, in some cases, from additional related DNA clones. The
overlapping sequences were assembled into a single contiguous sequence of high
redundancy (usually three to five overlapping sequences at each nucleotide position),
resulting in a final sequence identified as SEQ ID NO:X.

The cDNA Clone ID was deposited on the date and given the corresponding 
deposit number listed in "ATCC Deposit No:Z and Date." Some of the deposits contain
multiple different clones corresponding to the same gene. "Vector" refers to the type of 
vector contained in the cDNA Clone ID. 

"Total NT Seq." refers to the total number of nucleotides in the contig identified
[...] The deposited clone may contain all or most of these sequences,
reflected by the nucleotide position indicated as "5' NT of Clone Seq." and the "3' NT
of Clone Seq." of SEQ ID NO:X. The nucleotide position of SEQ ID NO:X of the
putative start codon (methionine) is identified as "5' NT of Start Codon." Similarly,
the nucleotide position of SEQ ID NO:X of the predicted signal sequence is identified as
"5' NT of First AA of Signal Pep."

The translated amino acid sequence, beginning with the methionine, is identified 
as "AA SEQ ID NO:Y," although other reading frames can also be easily translated 
using known molecular biology techniques. The polypeptides produced by these 
alternative open reading frames are specifically contemplated by the present invention. 

The first and last amino acid position of SEQ ID NO:Y of the predicted signal
peptide is identified as "First AA of Sig Pep" and "Last AA of Sig Pep." The predicted
first amino acid position of SEQ ID NO:Y of the secreted portion is identified as
"Predicted First AA of Secreted Portion." Finally, the amino acid position of SEQ ID
NO:Y of the last amino acid in the open reading frame is identified as "Last AA of
ORF."

SEQ ID NO:X and the translated SEQ ID NO:Y are sufficiently accurate and
otherwise suitable for a variety of uses well known in the art and described further
below. For instance, SEQ ID NO:X is useful for designing nucleic acid hybridization
probes that will detect nucleic acid sequences contained in SEQ ID NO:X or the cDNA 
contained in the deposited clone. These probes will also hybridize to nucleic acid
[...] by the cDNA clones identified in Table 1.

Nevertheless, DNA sequences generated by sequencing reactions can contain
sequencing errors. The errors exist as misidentified nucleotides, or as insertions or 
deletions of nucleotides in the generated DNA sequence. The erroneously inserted or 
deleted nucleotides cause frame shifts in the reading frames of the predicted amino acid 
sequence. In these cases, the predicted amino acid sequence diverges from the actual
amino acid sequence, even though the generated DNA sequence may be greater than 
99.9% identical to the actual DNA sequence (for example, one base insertion or deletion 
in an open reading frame of over 1000 bases). 
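
By way of illustration only, the effect described above can be sketched in Python. The 1002-base open reading frame, the inserted base, and the simplified identity measure (original bases divided by read length) are assumptions made for this example and are not taken from any sequence of the invention.

```python
# Standard genetic code, bases ordered T, C, A, G (NCBI translation table 1).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate codon by codon; a trailing partial codon is ignored."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def percent_identity_after_insertion(original_length):
    """Simplified identity: original bases as a share of a read with one extra base."""
    return 100.0 * original_length / (original_length + 1)


# Hypothetical 1002-base ORF: ATG followed by 333 alanine (GCT) codons.
orf = "ATG" + "GCT" * 333
# Simulated sequencing error: one spurious base inserted after the start codon.
read = orf[:3] + "A" + orf[3:]

true_protein = translate(orf)        # Met followed by 333 Ala
predicted_protein = translate(read)  # frame-shifted after the first codon

print(round(percent_identity_after_insertion(len(orf)), 2))
print(true_protein[:6], predicted_protein[:6])
```

Although the read remains more than 99.9% identical to the true sequence under this measure, every residue after the insertion is mispredicted, which is why verification against the deposited clone is provided for.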

Accordingly, for those applications requiring precision in the nucleotide
sequence or the amino acid sequence, the present invention provides not only the
generated nucleotide sequence identified as SEQ ID NO:X and the predicted translated
amino acid sequence identified as SEQ ID NO:Y, but also a sample of plasmid DNA
containing a human cDNA of the invention deposited with the ATCC, as set forth in
Table 1. The nucleotide sequence of each deposited clone can readily be determined by
sequencing the deposited clone in accordance with known methods. The predicted
amino acid sequence can then be verified from such deposits. Moreover, the amino
acid sequence of the protein encoded by a particular clone can also be directly
determined by peptide sequencing or by expressing the protein in a suitable host cell
containing the deposited human cDNA, collecting the protein, and determining its
sequence.

The present invention also relates to the genes corresponding to SEQ ID NO:X, 
SEQ ID NO: Y, or the deposited clone. The corresponding gene can be isolated in 
accordance with known methods using the sequence information disclosed herein. 
Such methods include preparing probes or primers from the disclosed sequence and 
identifying or amplifying the corresponding gene from appropriate sources of genomic
material. 

Also provided in the present invention are species homologs. Species 
homologs may be isolated and identified by making suitable probes or primers from the 
sequences provided herein and screening a suitable nucleic acid source for the desired 
homologue.

The polypeptides of the invention can be prepared in any suitable manner. Such 
polypeptides include isolated naturally occurring polypeptides, recombinantly produced
polypeptides, synthetically produced polypeptides, or polypeptides produced by a
combination of these methods. Means for preparing such polypeptides are well
understood in the art.

The polypeptides may be in the form of the secreted protein, including the
mature form, or may be a part of a larger protein, such as a fusion protein (see below).
It is often advantageous to include an additional amino acid sequence which contains
secretory or leader sequences, pro-sequences, sequences which aid in purification,
such as multiple histidine residues, or an additional sequence for stability during
recombinant production.

The polypeptides of the present invention are preferably provided in an isolated
form, and preferably are substantially purified. A recombinantly produced version of a
polypeptide, including the secreted polypeptide, can be substantially purified by the
one-step method described in Smith and Johnson, Gene 67:31-40 (1988).
Polypeptides of the invention also can be purified from natural or recombinant sources
using antibodies of the invention raised against the secreted protein in methods which
are well known in the art.

Signal Sequences 

Methods for predicting whether a protein has a signal sequence, as well as the
cleavage point for that sequence, are available. For instance, the method of McGeoch,
Virus Res. 3:271-286 (1985), uses the information from a short N-terminal charged
region and a subsequent uncharged region of the complete (uncleaved) protein. The
method of von Heijne, Nucleic Acids Res. 14:4683-4690 (1986) uses the information
from the residues surrounding the cleavage site, typically residues -13 to +2, where +1
indicates the amino terminus of the secreted protein. The accuracy of predicting the
cleavage points of known mammalian secretory proteins for each of these methods is in
the range of 75-80%. (von Heijne, supra.) However, the two methods do not always
produce the same predicted cleavage point(s) for a given protein.

In the present case, the deduced amino acid sequence of the secreted polypeptide
was analyzed by a computer program called SignalP (Henrik Nielsen et al., Protein
Engineering 10:1-6 (1997)), which predicts the cellular location of a protein based on
the amino acid sequence. As part of this computational prediction of localization, the
methods of McGeoch and von Heijne are incorporated. The analysis of the amino acid
sequences of the secreted proteins described herein by this program provided the results
shown in Table 1.

As one of ordinary skill would appreciate, however, cleavage sites sometimes
vary from organism to organism and cannot be predicted with absolute certainty.
Accordingly, the present invention provides secreted polypeptides having a sequence
shown in SEQ ID NO:Y [...]
uniform, resulting in more than one secreted species. These polypeptides, and the
polynucleotides encoding such polypeptides, are contemplated by the present invention.

Moreover, the signal sequence identified by the above analysis may not
necessarily predict the naturally occurring signal sequence. For example, the naturally
occurring signal sequence may be further upstream from the predicted signal sequence.
However, it is likely that the predicted signal sequence will be capable of directing the
secreted protein to the ER. These polypeptides, and the polynucleotides encoding such
polypeptides, are contemplated by the present invention.
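
As a minimal sketch, assuming a 1-based "Last AA of Sig Pep" position of the kind listed in Table 1, the predicted signal peptide and the predicted secreted portion can be separated as follows. The 30-residue precursor and the cleavage position 15 are invented for illustration and do not correspond to any SEQ ID NO:Y.

```python
def split_at_signal_peptide(precursor, last_aa_of_sig_pep):
    """Split a precursor into (signal peptide, secreted portion).

    last_aa_of_sig_pep is a 1-based residue position, so the secreted
    portion begins at position last_aa_of_sig_pep + 1.
    """
    if not 0 < last_aa_of_sig_pep < len(precursor):
        raise ValueError("cleavage position must fall inside the precursor")
    return precursor[:last_aa_of_sig_pep], precursor[last_aa_of_sig_pep:]


# Hypothetical precursor and predicted cleavage position (illustrative only).
precursor = "MKLLVLLALLGSAWAQDPTEKCYRNGSHIL"
signal_peptide, secreted = split_at_signal_peptide(precursor, 15)

print(signal_peptide)  # residues 1-15
print(secreted)        # residues 16-30
```

Because the predicted cleavage point is only an estimate, such a split should be treated as a working hypothesis rather than a determination of the mature N-terminus.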

Polynucleotide and Polypeptide Variants

"Variant" refers to a polynucleotide or polypeptide differing from the 
polynucleotide or polypeptide of the present invention, but retaining essential properties 
thereof. Generally, variants are overall closely similar, and, in many regions, identical 
to the polynucleotide or polypeptide of the present invention. 

By a polynucleotide having a nucleotide sequence at least, for example, 95%
"identical" to a reference nucleotide sequence of the present invention, it is intended that
the nucleotide sequence of the polynucleotide is identical to the reference sequence
except that the polynucleotide sequence may include up to five point mutations per each
100 nucleotides of the reference nucleotide sequence encoding the polypeptide. In other
words, to obtain a polynucleotide having a nucleotide sequence at least 95% identical to
a reference nucleotide sequence, up to 5% of the nucleotides in the reference sequence
may be deleted or substituted with another nucleotide, or a number of nucleotides up to
5% of the total nucleotides in the reference sequence may be inserted into the reference
sequence. The query sequence may be an entire sequence shown in Table 1, the ORF
(open reading frame), or any fragment specified as described herein.
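
The arithmetic of this definition can be sketched as follows. The function name is hypothetical, and whole-number percentages with integer arithmetic are assumed so that the count is rounded down without floating-point error.

```python
def max_alterations(reference_length, min_percent_identity):
    """Largest number of altered positions compatible with the identity threshold.

    Both arguments are integers; the result is rounded down.
    """
    if not 0 <= min_percent_identity <= 100:
        raise ValueError("percent identity must lie between 0 and 100")
    return reference_length * (100 - min_percent_identity) // 100


print(max_alterations(100, 95))   # five point mutations per 100 nucleotides
print(max_alterations(1000, 99))
```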

As a practical matter, whether any particular nucleic acid molecule or
polypeptide is at least 90%, 95%, 96%, 97%, 98% or 99% identical to a nucleotide
sequence of the present invention can be determined conventionally using known
computer programs. A preferred method for determining the best overall match between
a query sequence (a sequence of the present invention) and a subject sequence, also
referred to as a global sequence alignment, can be determined using the FASTDB
computer program based on the algorithm of Brutlag et al. (Comp. App. Biosci. (1990)
6:237-245). In a sequence alignment the query and subject sequences are both DNA
sequences. An RNA sequence can be compared by converting U's to T's. The result
of said global sequence alignment is in percent identity. Preferred parameters used in a
FASTDB alignment of DNA sequences to calculate percent identity are:
Matrix=Unitary, k-tuple=4, Mismatch Penalty=1, Joining Penalty=30, Randomization
Group Length=0, Cutoff Score=1, Gap Penalty=5, Gap Size Penalty=0.05, Window
Size=500 or the length of the subject nucleotide sequence, whichever is shorter.

If the subject sequence is shorter than the query sequence because of 5' or 3'
deletions, not because of internal deletions, a manual correction must be made to the
results. This is because the FASTDB program does not account for 5' and 3'
truncations of the subject sequence when calculating percent identity. For subject
sequences truncated at the 5' or 3' ends, relative to the query sequence, the percent
identity is corrected by calculating the number of bases of the query sequence that are 5'
and 3' of the subject sequence, which are not matched/aligned, as a percent of the total
bases of the query sequence. Whether a nucleotide is matched/aligned is determined by
results of the FASTDB sequence alignment. This percentage is then subtracted from
the percent identity, calculated by the above FASTDB program using the specified
parameters, to arrive at a final percent identity score. This corrected score is what is
used for the purposes of the present invention. Only bases outside the 5' and 3' bases
of the subject sequence, as displayed by the FASTDB alignment, which are not
matched/aligned with the query sequence, are calculated for the purposes of manually
adjusting the percent identity score.

For example, a 90 base subject sequence is aligned to a 100 base query
sequence to determine percent identity. The deletions occur at the 5' end of the subject
sequence and therefore, the FASTDB alignment does not show a match/alignment of
the first 10 bases at the 5' end. The 10 unpaired bases represent 10% of the sequence
(number of bases at the 5' and 3' ends not matched/total number of bases in the query
sequence) so 10% is subtracted from the percent identity score calculated by the
FASTDB program. If the remaining 90 bases were perfectly matched the final percent
identity would be 90%. In another example, a 90 base subject sequence is compared
with a 100 base query sequence. This time the deletions are internal deletions so that
there are no bases on the 5' or 3' of the subject sequence which are not matched/aligned
with the query. In this case the percent identity calculated by FASTDB is not manually
corrected. Once again, only bases 5' and 3' of the subject sequence which are not
matched/aligned with the query sequence are manually corrected for. No other manual
corrections are to be made for the purposes of the present invention.
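
The manual correction described above amounts to a single subtraction, sketched here in Python. The function and parameter names are hypothetical, and the percent identity passed in stands for whatever value the FASTDB program reported; nothing here calls FASTDB itself.

```python
def corrected_percent_identity(fastdb_percent_identity, query_length,
                               unmatched_5prime=0, unmatched_3prime=0):
    """Subtract the share of query bases lying outside the subject's 5' and 3' ends.

    unmatched_5prime / unmatched_3prime count query bases beyond the ends of
    the subject that are not matched/aligned. Internal deletions are not
    counted, so they leave the score unchanged.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    overhang = unmatched_5prime + unmatched_3prime
    return fastdb_percent_identity - 100.0 * overhang / query_length


# First example in the text: 10 unmatched bases at the 5' end of a 100 base
# query, remaining 90 bases perfectly matched.
print(corrected_percent_identity(100.0, 100, unmatched_5prime=10))

# Second example: internal deletions only, so no correction is applied.
# The 88.0 is an arbitrary placeholder for the reported score.
print(corrected_percent_identity(88.0, 100))
```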

By a polypeptide having an amino acid sequence at least, for example, 95%
"identical" to a query amino acid sequence of the present invention, it is intended that
the amino acid [...] of the query amino acid [...]
amino acid sequence, up to 5% of the amino acid residues in the subject sequence may
be inserted, deleted, (indels) or substituted with another amino acid. These alterations
of the reference sequence may occur at the amino or carboxy terminal positions of the
reference amino acid sequence or anywhere between those terminal positions,
interspersed either individually among residues in the reference sequence or in one or
more contiguous groups within the reference sequence.

As a practical matter, whether any particular polypeptide is at least 90%, 95%,
96%, 97%, 98% or 99% identical to, for instance, the amino acid sequences shown in
Table 1 or to the amino acid sequence encoded by the deposited DNA clone can be
determined conventionally using known computer programs. A preferred method for
determining the best overall match between a query sequence (a sequence of the present
invention) and a subject sequence, also referred to as a global sequence alignment, can
be determined using the FASTDB computer program based on the algorithm of Brutlag
et al. (Comp. App. Biosci. (1990) 6:237-245). In a sequence alignment the query and
subject sequences are either both nucleotide sequences or both amino acid sequences.
The result of said global sequence alignment is in percent identity. Preferred parameters
used in a FASTDB amino acid alignment are: Matrix=PAM 0, k-tuple=2, Mismatch
Penalty=1, Joining Penalty=20, Randomization Group Length=0, Cutoff Score=1,
Window Size=sequence length, Gap Penalty=5, Gap Size Penalty=0.05, Window
Size=500 or the length of the subject amino acid sequence, whichever is shorter.

If the subject sequence is shorter than the query sequence due to N- or C- 
terminal deletions, not because of internal deletions, a manual correction must be made 
to the results. This is because the FASTDB program does not account for N- and C-
terminal truncations of the subject sequence when calculating global percent identity.
For subject sequences truncated at the N- and C-termini, relative to the query
sequence, the percent identity is corrected by calculating the number of residues of the
query sequence that are N- and C-terminal of the subject sequence, which are not
matched/aligned with a corresponding subject residue, as a percent of the total residues of
the query sequence. Whether a residue is matched/aligned is determined by results of 
the FASTDB sequence alignment. This percentage is then subtracted from the percent 
identity, calculated by the above FASTDB program using the specified parameters, to 
arrive at a final percent identity score. This final percent identity score is what is used 
for the purposes of the present invention. Only residues to the N- and C-termini of the 
subject sequence, which are not matched/aligned with the query sequence, are 
considered for the purposes of manually adjusting the percent identity score. That is, 
only query residue positions outside the farthest N- and C-terminal residues of the 
subject sequence. 



For example, a 90 amino acid residue subject sequence is aligned with a 100
residue query sequence to determine percent identity. The deletion occurs at the N-
terminus of the subject sequence and therefore, the FASTDB alignment does not show
a matching/alignment of the first 10 residues at the N-terminus. The 10 unpaired
residues represent 10% of the sequence (number of residues at the N- and C-termini
not matched/total number of residues in the query sequence) so 10% is subtracted from
the percent identity score calculated by the FASTDB program. If the remaining 90
residues were perfectly matched the final percent identity would be 90%. In another
example, a 90 residue subject sequence is compared with a 100 residue query sequence.
This time the deletions are internal deletions so there are no residues at the N- or
C-terminus of the subject sequence which are not matched/aligned with the query. In this
case the percent identity calculated by FASTDB is not manually corrected. Once again,
only residue positions outside the N- and C-terminal ends of the subject sequence, as
displayed in the FASTDB alignment, which are not matched/aligned with the query
sequence are manually corrected for. No other manual corrections are to be made for the
purposes of the present invention.
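
The distinction between terminal truncations and internal deletions can be sketched as follows. The sketch assumes a pairwise alignment represented as two equal-length strings with "-" marking gaps; that representation, the function name, and the short example sequences are assumptions for illustration and are not FASTDB output.

```python
def terminal_overhang(aligned_query, aligned_subject, gap="-"):
    """Count query residues lying outside the N- and C-terminal ends of the subject.

    Returns (n_terminal, c_terminal). Gaps in the interior of the subject
    (internal deletions) are deliberately not counted.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    subject_columns = [i for i, ch in enumerate(aligned_subject) if ch != gap]
    if not subject_columns:
        raise ValueError("subject contains no residues")
    first, last = subject_columns[0], subject_columns[-1]
    n_terminal = sum(1 for ch in aligned_query[:first] if ch != gap)
    c_terminal = sum(1 for ch in aligned_query[last + 1:] if ch != gap)
    return n_terminal, c_terminal


query = "MKTAYIAKQR"                             # hypothetical 10-residue query
print(terminal_overhang(query, "--TAYIAKQR"))   # N-terminal truncation
print(terminal_overhang(query, "MKTA--AKQR"))   # internal deletion only
```

Only the first case yields residues to be subtracted when adjusting the percent identity score; the second leaves the score as calculated.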

The variants may contain alterations in the coding regions, non-coding regions,
or both. Especially preferred are polynucleotide variants containing alterations which
produce silent substitutions, additions, or deletions, but do not alter the properties or
activities of the encoded polypeptide. Nucleotide variants produced by silent
substitutions due to the degeneracy of the genetic code are preferred. Moreover,
variants in which 5-10, 1-5, or 1-2 amino acids are substituted, deleted, or added in any
combination are also preferred. Polynucleotide variants can be produced for a variety
of reasons, e.g., to optimize codon expression for a particular host (change codons in
the human mRNA to those preferred by a bacterial host such as E. coli).
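
A silent substitution of this kind can be sketched in Python using the standard genetic code. The preferred-codon table below is a hypothetical placeholder covering three amino acids and is not an actual codon usage table for E. coli or any other host; the 18-base coding sequence is likewise invented.

```python
# Standard genetic code, bases ordered T, C, A, G (NCBI translation table 1).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Hypothetical host preferences (illustrative only).
PREFERRED_CODON = {"L": "CTG", "R": "CGT", "P": "CCG"}


def translate(dna):
    """Translate codon by codon; a trailing partial codon is ignored."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def recode(dna, preferred=PREFERRED_CODON):
    """Swap each codon for the preferred synonymous codon, where one is listed."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(preferred.get(CODON_TABLE[codon], codon) for codon in codons)


original = "ATGCTAAGACCCTTATAA"  # Met-Leu-Arg-Pro-Leu-stop
recoded = recode(original)

print(recoded)
print(translate(original) == translate(recoded))
```

The nucleotide sequence changes at several positions while the encoded polypeptide is unchanged, which is the property that makes such variants silent.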

Naturally occurring variants are called "allelic variants," and refer to one of
several alternate forms of a gene occupying a given locus on a chromosome of an 
organism. (Genes II, Lewin, B., ed., John Wiley & Sons, New York (1985).) These 
allelic variants can vary at either the polynucleotide and/or polypeptide level. 
Alternatively, non-naturally occurring variants may be produced by mutagenesis 
techniques or by direct synthesis. 

Using known methods of protein engineering and recombinant DNA
technology, variants may be generated to improve or alter the characteristics of the
polypeptides of the present invention. [...]
reported variant KGF proteins having heparin binding activity even after
deleting 3, 8, or 27 amino terminal amino acid residues. Similarly, interferon gamma
exhibited up to ten times higher activity after deleting 8-10 amino acid residues from the
carboxy terminus of this protein (Dobeli et al., J. Biotechnology 7:199-216 (1988)).

Moreover, ample evidence demonstrates that variants often retain a biological
activity similar to that of the naturally occurring protein. For example, Gayle and
coworkers (J. Biol. Chem. 268:22105-22111 (1993)) conducted extensive mutational
analysis of human cytokine IL-1a. They used random mutagenesis to generate over
3,500 individual IL-1a mutants that averaged 2.5 amino acid changes per variant over
the entire length of the molecule. Multiple mutations were examined at every possible
amino acid position. The investigators found that "[m]ost of the molecule could be
altered with little effect on either [binding or biological activity]." (See Abstract.) In
fact, only 23 unique amino acid sequences, out of more than 3,500 nucleotide
sequences examined, produced a protein that significantly differed in activity from wild-
type.

Furthermore, even if deleting one or more amino acids from the N-terminus or
C-terminus of a polypeptide results in modification or loss of one or more biological
functions, other biological activities may still be retained. For example, the ability of a
deletion variant to induce and/or to bind antibodies which recognize the secreted form
will likely be retained when less than the majority of the residues of the secreted form
are removed from the N-terminus or C-terminus. Whether a particular polypeptide
lacking N- or C-terminal residues of a protein retains such immunogenic activities can
readily be determined by routine methods described herein and otherwise known in the
art.

Thus, the invention further includes polypeptide variants which show
substantial biological activity. Such variants include deletions, insertions, inversions,
repeats, and substitutions selected according to general rules known in the art so as to
have little effect on activity. For example, guidance concerning how to make
phenotypically silent amino acid substitutions is provided in Bowie, J. U. et al.,
Science 247:1306-1310 (1990), wherein the authors indicate that there are two main
strategies for studying the tolerance of an amino acid sequence to change.

The first strategy exploits the tolerance of amino acid substitutions by natural 
selection during the process of evolution. By comparing amino acid sequences in 
different species, conserved amino acids can be identified. These conserved amino 
acids are likely important for protein function. In contrast, amino acid positions
where substitutions have been tolerated by natural selection are likely not
critical for protein function. Thus, positions tolerating amino acid
substitution could be modified while still maintaining biological activity of the protein.

The second strategy uses genetic engineering to introduce amino acid changes at
specific positions of a cloned gene to identify regions critical for protein function. For
example, site directed mutagenesis or alanine-scanning mutagenesis (introduction of
single alanine mutations at every residue in the molecule) can be used. (Cunningham
and Wells, Science 244:1081-1085 (1989).) The resulting mutant molecules can then
be tested for biological activity. 
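
By way of illustration, the set of single-alanine mutants used in alanine-scanning mutagenesis can be enumerated as follows. The function name and the four-residue example are hypothetical, and positions that are already alanine are skipped in this sketch because substituting them would reproduce the starting sequence.

```python
def alanine_scan(protein):
    """Return (position, original residue, mutant sequence) for each single Ala mutant.

    Positions are 1-based. Residues that are already alanine are skipped.
    """
    mutants = []
    for i, residue in enumerate(protein):
        if residue == "A":
            continue
        mutants.append((i + 1, residue, protein[:i] + "A" + protein[i + 1:]))
    return mutants


for position, residue, mutant in alanine_scan("MKAT"):
    print(f"{residue}{position}A: {mutant}")
```

Each mutant sequence would then be expressed and tested for biological activity as described above.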

As the authors state, these two strategies have revealed that proteins are
surprisingly tolerant of amino acid substitutions. The authors further indicate which
amino acid changes are likely to be permissive at certain amino acid positions in the
protein. For example, most buried (within the tertiary structure of the protein) amino
acid residues require nonpolar side chains, whereas few features of surface side chains
are generally conserved. Moreover, tolerated conservative amino acid substitutions
involve replacement of the aliphatic or hydrophobic amino acids Ala, Val, Leu and Ile;
replacement of the hydroxyl residues Ser and Thr; replacement of the acidic residues
Asp and Glu; replacement of the amide residues Asn and Gln; replacement of the basic
residues Lys, Arg, and His; replacement of the aromatic residues Phe, Tyr, and Trp;
and replacement of the small-sized amino acids Ala, Ser, Thr, Met, and Gly.
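
The residue groups listed above can be encoded directly, as in the following sketch. Treating a substitution as conservative whenever both residues fall in at least one common group is an assumption of this example, as is the function name.

```python
# One-letter codes for the residue groups listed in the text.
CONSERVATIVE_GROUPS = [
    set("AVLI"),   # hydrophobic: Ala, Val, Leu, Ile
    set("ST"),     # hydroxyl: Ser, Thr
    set("DE"),     # acidic: Asp, Glu
    set("NQ"),     # amide: Asn, Gln
    set("KRH"),    # basic: Lys, Arg, His
    set("FYW"),    # aromatic: Phe, Tyr, Trp
    set("ASTMG"),  # small-sized: Ala, Ser, Thr, Met, Gly
]


def is_conservative(original, replacement):
    """True if the two different residues share at least one listed group."""
    if original == replacement:
        return False
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)


print(is_conservative("L", "I"))  # both hydrophobic
print(is_conservative("K", "D"))  # basic versus acidic
```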

Besides conservative amino acid substitution, variants of the present invention
include (i) substitutions with one or more of the non-conserved amino acid residues,
where the substituted amino acid residues may or may not be one encoded by the
genetic code, or (ii) substitution with one or more of amino acid residues having a
substituent group, or (iii) fusion of the mature polypeptide with another compound,
such as a compound to increase the stability and/or solubility of the polypeptide (for
example, polyethylene glycol), or (iv) fusion of the polypeptide with additional amino
acids, such as an IgG Fc fusion region peptide, or leader or secretory sequence, or a
sequence facilitating purification. Such variant polypeptides are deemed to be within
the scope of those skilled in the art from the teachings herein.

For example, polypeptide variants containing amino acid substitutions of
charged amino acids with other charged or neutral amino acids may produce proteins
with improved characteristics, such as less aggregation. Aggregation of pharmaceutical
formulations both reduces activity and increases clearance due to the aggregate's
immunogenic activity. (Pinckard et al., Clin. Exp. Immunol. 2:331-340 (1967);
Robbins et al., Diabetes 36:838-845 (1987); Cleland et al., Crit. Rev. Therapeutic
Drug Carrier Systems 10:307-377 (1993).)

Polynucleotide and Polypeptide Fragments

In the present invention, a "polynucleotide fragment" refers to a short 
polynucleotide having a nucleic acid sequence contained in the deposited clone or 
shown in SEQ ID NO:X. The short nucleotide fragments are preferably at least about 
15 nt, and more preferably at least about 20 nt, still more preferably at least about 30 nt, 
and even more preferably, at least about 40 nt in length. A fragment "at least 20 nt in 
length," for example, is intended to include 20 or more contiguous bases from the 
cDNA sequence contained in the deposited clone or the nucleotide sequence shown in 
SEQ ID NO:X. These nucleotide fragments are useful as diagnostic probes and primers 
as discussed herein. Of course, larger fragments (e.g., 50, 150, 500, 600, 2000 
nucleotides) are preferred. 

Moreover, representative examples of polynucleotide fragments of the 
invention, include, for example, fragments having a sequence from about nucleotide 
number 1-50, 51-100, 101-150, 151-200, 201-250, 251-300, 301-350, 351-400, 401-450, 
451-500, 501-550, 551-600, 651-700, 701-750, 751-800, 800-850, 851-900, 
901-950, 951-1000, 1001-1050, 1051-1100, 1101-1150, 1151-1200, 1201-1250, 
1251-1300, 1301-1350, 1351-1400, 1401-1450, 1451-1500, 1501-1550, 1551-1600, 
1601-1650, 1651-1700, 1701-1750, 1751-1800, 1801-1850, 1851-1900, 1901-1950, 
1951-2000, or 2001 to the end of SEQ ID NO:X or the cDNA contained in the 
deposited clone. In this context "about" includes the particularly recited ranges, larger 
or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini. 
Preferably, these fragments encode a polypeptide which has biological activity. More 
preferably, these polynucleotides can be used as probes or primers as discussed herein. 
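
The fragment ranges and the meaning of "about" can be sketched in code. The example below is a simplified illustration, not a reproduction of the recited list: it tiles a sequence into regular 50-nucleotide windows (the recited list is not perfectly regular), and the function names and the `slack` parameter are hypothetical.

```python
def fragment_windows(seq_length, window=50, last_start=2001):
    """Yield 1-based inclusive (start, end) ranges: 1-50, 51-100, and so on.

    Full windows are produced up to position last_start - 1; whatever
    remains after the last full window runs to the end of the sequence.
    """
    start = 1
    while start < last_start and start + window - 1 <= seq_length:
        yield (start, start + window - 1)
        start += window
    if start <= seq_length:
        yield (start, seq_length)


def matches_about(start, end, ref_start, ref_end, slack=5):
    """True if each terminus lies within `slack` nucleotides of the recited one."""
    return abs(start - ref_start) <= slack and abs(end - ref_end) <= slack
```

For instance, a fragment spanning nucleotides 48-103 falls within "about 51-100" under this reading, whereas a fragment spanning 40-100 does not, because its first terminus is 11 nucleotides from the recited start.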

In the present invention, a "polypeptide fragment" refers to a short amino acid 
sequence contained in SEQ ID NO:Y or encoded by the cDNA contained in the 
deposited clone. Protein fragments may be "free-standing," or comprised within a 
larger polypeptide of which the fragment forms a part or region, most preferably as a 
single continuous region. Representative examples of polypeptide fragments of the 
invention, include, for example, fragments from about amino acid number 1-20, 21-40, 
41-60, 61-80, 81-100, 102-120, 121-140, 141-160, or 161 to the end of the coding 
region. Moreover, polypeptide fragments can be about 20, 30, 40, 50, 60, 70, 80, 90, 
100, 110, 120, 130, 140, or 150 amino acids in length. In this context "about" 
includes the particularly recited ranges, larger or smaller by several (5, 4, 3, 2, or 1) 
amino acids, at either extreme or at both extremes. 

Preferred polypeptide fragments include the secreted protein as well as the 
mature form. Further preferred polypeptide fragments include the secreted protein or 
the mature form having a continuous series of deleted residues from the amino or the 
carboxy terminus, or both. For example, any number of amino acids, ranging from 
1-60, can be deleted from the amino terminus of either the secreted polypeptide or the 
mature form. Similarly, any number of amino acids, ranging from 1-30, can be deleted 
from the carboxy terminus of the secreted protein or mature form. Furthermore, any 
combination of the above amino and carboxy terminus deletions is preferred. 
Similarly, polynucleotide fragments encoding these polypeptide fragments are also 
preferred. 

Particularly, N-terminal deletions of the polypeptide of the present invention can 
be described by the general formula m-p, where p is the total number of amino acids in 
the polypeptide and m is an integer from 2 to (p-1), and where both of these integers (m 
& p) correspond to the position of the amino acid residue identified in SEQ ID NO:Y. 

Moreover, C-terminal deletions of the polypeptide of the present invention can 
also be described by the general formula 1-n, where n is an integer from 2 to (p-1), and 
again where these integers (n & p) correspond to the position of the amino acid residue 
identified in SEQ ID NO:Y. 

The invention also provides polypeptides having one or more amino acids 
deleted from both the amino and the carboxyl termini, which may be described 
generally as having residues m-n of SEQ ID NO:Y, where m and n are integers as 
described above. 
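
The m-p, 1-n, and m-n formulas can be enumerated directly. The following is a minimal sketch with hypothetical function names. It assumes 1-based residue positions as in SEQ ID NO:Y and, for the double deletions, additionally assumes m <= n so that every fragment is non-empty; the text itself does not state that constraint.

```python
def n_terminal_deletions(p):
    """Residue ranges m-p for every integer m from 2 to p-1."""
    return [(m, p) for m in range(2, p)]


def c_terminal_deletions(p):
    """Residue ranges 1-n for every integer n from 2 to p-1."""
    return [(1, n) for n in range(2, p)]


def double_deletions(p):
    """Residue ranges m-n with m and n in 2..p-1 (assumption: m <= n)."""
    return [(m, n) for m in range(2, p) for n in range(m, p)]


def fragment(sequence, start, end):
    """Return residues start..end (1-based, inclusive) of a sequence string."""
    return sequence[start - 1:end]
```

For a toy five-residue polypeptide (p = 5), the N-terminal deletions are the ranges 2-5, 3-5, and 4-5, and the C-terminal deletions are 1-2, 1-3, and 1-4.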

Also preferred are polypeptide and polynucleotide fragments characterized by 
structural or functional domains, such as fragments that comprise alpha-helix and alpha- 
helix forming regions, beta-sheet and beta-sheet-forming regions, turn and turn- 
forming regions, coil and coil-forming regions, hydrophilic regions, hydrophobic 
regions, alpha amphipathic regions, beta amphipathic regions, flexible regions, surface- 
forming regions, substrate binding region, and high antigenic index regions. 
Polypeptide fragments of SEQ ID NO:Y falling within conserved domains are 
specifically contemplated by the present invention. Moreover, polynucleotide 
fragments encoding these domains are also contemplated. 

Other preferred fragments are biologically active fragments. Biologically active 
fragments are those exhibiting activity similar, but not necessarily identical, to an 
activity of the polypeptide of the present invention. The biological activity of the 
fragments may include an improved desired activity, or a decreased undesirable activity. 

Epitopes & Antibodies 

epitope, as well as the polynucleotide encoding this fragment. A region of a protein 
molecule to which an antibody can bind is defined as an "antigenic epitope." In 
contrast, an "immunogenic epitope" is defined as a part of a protein that elicits an 
antibody response. (See, for instance, Geysen et al., Proc. Natl. Acad. Sci. USA 
81:3998-4002 (1983).) 

Fragments which function as epitopes may be produced by any conventional 
means. (See, e.g., Houghten, R. A., Proc. Natl. Acad. Sci. USA 82:5131-5135 
(1985) further described in U.S. Patent No. 4,631,211.) 

In the present invention, antigenic epitopes preferably contain a sequence of at 
least seven, more preferably at least nine, and most preferably between about 15 to 
about 30 amino acids. Antigenic epitopes are useful to raise antibodies, including 
monoclonal antibodies, that specifically bind the epitope. (See, for instance, Wilson et 
al., Cell 37:767-778 (1984); Sutcliffe, J. G. et al., Science 219:660-666 (1983).) 

Similarly, immunogenic epitopes can be used to induce antibodies according to 
methods well known in the art. (See, for instance, Sutcliffe et al., supra; Wilson et al., 
supra; Chow, M. et al., Proc. Natl. Acad. Sci. USA 82:910-914; and Bittle, F. J. et 
al., J. Gen. Virol. 66:2347-2354 (1985).) A preferred immunogenic epitope includes 
the secreted protein. The immunogenic epitopes may be presented together with a 
carrier protein, such as an albumin, to an animal system (such as rabbit or mouse) or, if 
it is long enough (at least about 25 amino acids), without a carrier. However, 
immunogenic epitopes comprising as few as 8 to 10 amino acids have been shown to be 
sufficient to raise antibodies capable of binding to, at the very least, linear epitopes in a 
denatured polypeptide (e.g., in Western blotting). 

As used herein, the term "antibody" (Ab) or "monoclonal antibody" (Mab) is 
meant to include intact molecules as well as antibody fragments (such as, for example, 
Fab and F(ab')2 fragments) which are capable of specifically binding to protein. Fab 
and F(ab')2 fragments lack the Fc fragment of intact antibody, clear more rapidly from 
the circulation, and may have less non-specific tissue binding than an intact antibody. 
(Wahl et al., J. Nucl. Med. 24:316-325 (1983).) Thus, these fragments are preferred, 
as well as the products of a FAB or other immunoglobulin expression library. 
Moreover, antibodies of the present invention include chimeric, single chain, and 
humanized antibodies. 

Fusion Proteins 

Any polypeptide of the present invention can be used to generate fusion 
proteins. For example, the polypeptide of the present invention, when fused to a 
second protein, can be used as an antigenic tag. Antibodies raised against the 
polypeptide of the present invention can be used to indirectly detect the second protein 
by binding to the polypeptide. Moreover, because secreted proteins target cellular 
locations based on trafficking signals, the polypeptides of the present invention can be 
used as targeting molecules once fused to other proteins. 

Examples of domains that can be fused to polypeptides of the present invention 
include not only heterologous signal sequences, but also other heterologous functional 
regions. The fusion does not necessarily need to be direct, but may occur through 
linker sequences. 

Moreover, fusion proteins may also be engineered to improve characteristics of 
the polypeptide of the present invention. For instance, a region of additional amino 
acids, particularly charged amino acids, may be added to the N-terminus of the 
polypeptide to improve stability and persistence during purification from the host cell or 
subsequent handling and storage. Also, peptide moieties may be added to the 
polypeptide to facilitate purification. Such regions may be removed prior to final 
preparation of the polypeptide. The addition of peptide moieties to facilitate handling of 
polypeptides is a familiar and routine technique in the art. 

Moreover, polypeptides of the present invention, including fragments, and 
specifically epitopes, can be combined with parts of the constant domain of 
immunoglobulins (IgG), resulting in chimeric polypeptides. These fusion proteins 
facilitate purification and show an increased half-life in vivo. One reported example 
describes chimeric proteins consisting of the first two domains of the human 
CD4-polypeptide and various domains of the constant regions of the heavy or light chains of 
mammalian immunoglobulins. (EP A 394,827; Traunecker et al., Nature 331:84-86 
(1988).) Fusion proteins having disulfide-linked dimeric structures (due to the IgG) 
can also be more efficient in binding and neutralizing other molecules than the 
monomeric secreted protein or protein fragment alone. (Fountoulakis et al., J. 
Biochem. 270:3958-3964 (1995).) 

Similarly, EP-A-0 464 533 (Canadian counterpart 2045869) discloses fusion 
proteins comprising various portions of constant region of immunoglobulin molecules 
together with another human protein or part thereof. In many cases, the Fc part in a 
fusion protein is beneficial in therapy and diagnosis, and thus can result in, for 
example, improved pharmacokinetic properties. (EP A 0232 262.) Alternatively, 
deleting the Fc part after the fusion protein has been expressed, detected, and purified, 

purpose of high-throughput screening assays to identify antagonists of hIL-5. (See, D. 
Bennett et al., J. Molecular Recognition 8:52-58 (1995); K. Johanson et al., J. Biol. 
Chem. 270:9459-9471 (1995).) 

Moreover, the polypeptides of the present invention can be fused to marker 
sequences, such as a peptide which facilitates purification of the fused polypeptide. In 
preferred embodiments, the marker amino acid sequence is a hexa-histidine peptide, 
such as the tag provided in a pQE vector (QIAGEN, Inc., 9259 Eton Avenue, 
Chatsworth, CA, 91311), among others, many of which are commercially available. 
As described in Gentz et al., Proc. Natl. Acad. Sci. USA 86:821-824 (1989), for 
instance, hexa-histidine provides for convenient purification of the fusion protein. 
Another peptide tag useful for purification, the "HA" tag, corresponds to an epitope 
derived from the influenza hemagglutinin protein. (Wilson et al., Cell 37:767 (1984).) 

Thus, any of these above fusions can be engineered using the polynucleotides 
or the polypeptides of the present invention. 

Vectors, Host Cells, and Protein Production 

The present invention also relates to vectors containing the polynucleotide of the 
present invention, host cells, and the production of polypeptides by recombinant 
techniques. The vector may be, for example, a phage, plasmid, viral, or retroviral 
vector. Retroviral vectors may be replication competent or replication defective. In the 
latter case, viral propagation generally will occur only in complementing host cells. 

The polynucleotides may be joined to a vector containing a selectable marker for 
propagation in a host. Generally, a plasmid vector is introduced in a precipitate, such 
as a calcium phosphate precipitate, or in a complex with a charged lipid. If the vector is 
a virus, it may be packaged in vitro using an appropriate packaging cell line and then 
transduced into host cells. 

The polynucleotide insert should be operatively linked to an appropriate 
promoter, such as the phage lambda PL promoter, the E. coli lac, trp, phoA and tac 
promoters, the SV40 early and late promoters and promoters of retroviral LTRs, to 
name a few. Other suitable promoters will be known to the skilled artisan. The 
expression constructs will further contain sites for transcription initiation, termination, 
and, in the transcribed region, a ribosome binding site for translation. The coding 
portion of the transcripts expressed by the constructs will preferably include a 
translation initiating codon at the beginning and a termination codon (UAA, UGA or 
UAG) appropriately positioned at the end of the polypeptide to be translated. 
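
The requirement that the coding portion begin with an initiating codon and end with a termination codon can be checked mechanically. The sketch below is illustrative only: the function name is hypothetical, and it assumes AUG as the initiating codon, which the text does not name explicitly. DNA input is accepted by converting T to U.

```python
STOP_CODONS = {"UAA", "UGA", "UAG"}  # termination codons named in the text
START_CODON = "AUG"                  # assumption: the usual initiating codon


def has_valid_coding_frame(coding_region):
    """Check that a coding region starts with AUG, ends with a stop codon,
    and contains no in-frame stop codon before the final one."""
    rna = coding_region.upper().replace("T", "U")
    if len(rna) < 6 or len(rna) % 3 != 0:
        return False
    codons = [rna[i:i + 3] for i in range(0, len(rna), 3)]
    return (
        codons[0] == START_CODON
        and codons[-1] in STOP_CODONS
        and not any(codon in STOP_CODONS for codon in codons[1:-1])
    )
```

A sequence such as AUGGCUUAA passes the check, whereas AUGUAAGCUUAG fails because a termination codon appears in frame before the end.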

As indicated, the expression vectors will preferably include at least one 
selectable marker. Such markers include dihydrofolate reductase, G418 or neomycin 
resistance for eukaryotic cell culture and tetracycline, kanamycin or ampicillin resistance 
genes for culturing in E. coli and other bacteria. Representative examples of 
appropriate hosts include, but are not limited to, bacterial cells, such as E. coli, 
Streptomyces and Salmonella typhimurium cells; fungal cells, such as yeast cells; insect 
cells such as Drosophila S2 and Spodoptera Sf9 cells; animal cells such as CHO, COS, 
293, and Bowes melanoma cells; and plant cells. Appropriate culture mediums and 
conditions for the above-described host cells are known in the art. 

Vectors preferred for use in bacteria include pQE70, pQE60 and pQE-9, 
available from QIAGEN, Inc.; pBluescript vectors, Phagescript vectors, pNH8A, 
pNH16a, pNH18A, pNH46A, available from Stratagene Cloning Systems, Inc.; and 
ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 available from Pharmacia Biotech, 
Inc. Among preferred eukaryotic vectors are pWLNEO, pSV2CAT, pOG44, pXT1 
and pSG available from Stratagene; and pSVK3, pBPV, pMSG and pSVL available 
from Pharmacia. Other suitable vectors will be readily apparent to the skilled artisan. 

Introduction of the construct into the host cell can be effected by calcium 
phosphate transfection, DEAE-dextran mediated transfection, cationic lipid-mediated 
transfection, electroporation, transduction, infection, or other methods. Such methods 
are described in many standard laboratory manuals, such as Davis et al., Basic Methods 
In Molecular Biology (1986). It is specifically contemplated that the polypeptides of the 
present invention may in fact be expressed by a host cell lacking a recombinant vector. 

A polypeptide of this invention can be recovered and purified from recombinant 
cell cultures by well-known methods including ammonium sulfate or ethanol 
precipitation, acid extraction, anion or cation exchange chromatography, 
phosphocellulose chromatography, hydrophobic interaction chromatography, affinity 
chromatography, hydroxylapatite chromatography and lectin chromatography. Most 
preferably, high performance liquid chromatography ("HPLC") is employed for 
purification. 

Polypeptides of the present invention, and preferably the secreted form, can also 
be recovered from: products purified from natural sources, including bodily fluids, 
tissues and cells, whether directly isolated or cultured; products of chemical synthetic 
procedures; and products produced by recombinant techniques from a prokaryotic or 
eukaryotic host, including, for example, bacterial, yeast, higher plant, insect, and 
mammalian cells. Depending upon the host employed in a recombinant production 
procedure, the polypeptides of the present invention may be glycosylated or may be 
non-glycosylated. 

after translation in all eukaryotic cells. While the N-terminal methionine on most 
proteins also is efficiently removed in most prokaryotes, for some proteins, this 
prokaryotic removal process is inefficient, depending on the nature of the amino acid to 
which the N-terminal methionine is covalently linked. 

Uses of the Polynucleotides 

Each of the polynucleotides identified herein can be used in numerous ways as 
reagents. The following description should be considered exemplary and utilizes 
known techniques. 

The polynucleotides of the present invention are useful for chromosome 
identification. There exists an ongoing need to identify new chromosome markers, 
since few chromosome marking reagents, based on actual sequence data (repeat 
polymorphisms), are presently available. Each polynucleotide of the present invention 
can be used as a chromosome marker. 

Briefly, sequences can be mapped to chromosomes by preparing PCR primers 
(preferably 15-25 bp) from the sequences shown in SEQ ID NO:X. Primers can be 
selected using computer analysis so that primers do not span more than one predicted 
exon in the genomic DNA. These primers are then used for PCR screening of somatic 
cell hybrids containing individual human chromosomes. Only those hybrids containing 
the human gene corresponding to the SEQ ID NO:X will yield an amplified fragment. 
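
The primer-selection constraint described above (primers of 15-25 bp that do not span more than one predicted exon) can be sketched as a simple filter. This is an illustration under stated assumptions, not the computer analysis referred to in the text: the function name is hypothetical, and exon boundaries are assumed to be supplied as 1-based inclusive (start, end) positions in the cDNA sequence.

```python
def candidate_primers(cdna, exons, length=20):
    """Return (start, primer) pairs for every window of `length` bases
    lying entirely within a single predicted exon.

    `exons` is a list of 1-based inclusive (start, end) positions in the
    cDNA; these coordinates are hypothetical inputs for the example.
    """
    if not 15 <= length <= 25:
        raise ValueError("primer length should be 15-25 bases")
    primers = []
    for exon_start, exon_end in exons:
        # The last valid start leaves exactly `length` bases inside the exon.
        for start in range(exon_start, exon_end - length + 2):
            primers.append((start, cdna[start - 1:start - 1 + length]))
    return primers
```

With two 30-base exons and 20-base primers, each exon contributes 11 candidate windows, and no candidate crosses the junction between the exons.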

Similarly, somatic hybrids provide a rapid method of PCR mapping the 
polynucleotides to particular chromosomes. Three or more clones can be assigned per 
day using a single thermal cycler. Moreover, sublocalization of the polynucleotides can 
be achieved with panels of specific chromosome fragments. Other gene mapping 
strategies that can be used include in situ hybridization, prescreening with labeled 
flow-sorted chromosomes, and preselection by hybridization to construct 
chromosome-specific cDNA libraries. 

Precise chromosomal location of the polynucleotides can also be achieved using 
fluorescence in situ hybridization (FISH) of a metaphase chromosomal spread. This 
technique uses polynucleotides as short as 500 or 600 bases; however, polynucleotides 
2,000-4,000 bp are preferred. For a review of this technique, see Verma et al., 
"Human Chromosomes: a Manual of Basic Techniques," Pergamon Press, New York 
(1988). 

For chromosome mapping, the polynucleotides can be used individually (to 
mark a single chromosome or a single site on that chromosome) or in panels (for 
marking multiple sites and/or multiple chromosomes). Preferred polynucleotides 
correspond to the noncoding regions of the cDNAs because the coding sequences are 
more likely conserved within gene families, thus increasing the chance of cross 
hybridization during chromosomal mapping. 

Once a polynucleotide has been mapped to a precise chromosomal location, the 
physical position of the polynucleotide can be used in linkage analysis. Linkage 
analysis establishes coinheritance between a chromosomal location and presentation of a 
particular disease. (Disease mapping data are found, for example, in V. McKusick, 
Mendelian Inheritance in Man (available on line through Johns Hopkins University 
Welch Medical Library).) Assuming 1 megabase mapping resolution and one gene per 
20 kb, a cDNA precisely localized to a chromosomal region associated with the disease 
could be one of 50-500 potential causative genes. 
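
The lower figure follows directly from the stated assumptions: a 1 megabase region at one gene per 20 kb contains 50 genes. The short sketch below makes the arithmetic explicit; the function name is hypothetical, and reading the upper figure of 500 as a 10 megabase region at the same gene density is an interpretation, not something the text states.

```python
def candidate_gene_count(region_bp, bp_per_gene=20_000):
    """Number of genes expected in a region at one gene per `bp_per_gene`."""
    return region_bp // bp_per_gene


one_megabase = candidate_gene_count(1_000_000)    # 1 Mb resolution
ten_megabases = candidate_gene_count(10_000_000)  # assumed coarser resolution
```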

Thus, once coinheritance is established, differences in the polynucleotide and 
the corresponding gene between affected and unaffected individuals can be examined. 

First, visible structural alterations in the chromosomes, such as deletions or 
translocations, are examined in chromosome spreads or by PCR. If no structural 
alterations exist, the presence of point mutations is ascertained. Mutations observed in 
some or all affected individuals, but not in normal individuals, indicate that the 
mutation may cause the disease. However, complete sequencing of the polypeptide and 
the corresponding gene from several normal individuals is required to distinguish the 
mutation from a polymorphism. If a new polymorphism is identified, this polymorphic 
polypeptide can be used for further linkage analysis. 

Furthermore, increased or decreased expression of the gene in affected 
individuals as compared to unaffected individuals can be assessed using 
polynucleotides of the present invention. Any of these alterations (altered expression, 
chromosomal rearrangement, or mutation) can be used as a diagnostic or prognostic 
marker. 

In addition to the foregoing, a polynucleotide can be used to control gene 
expression through triple helix formation or antisense DNA or RNA. Both methods 
rely on binding of the polynucleotide to DNA or RNA. For these techniques, preferred 
polynucleotides are usually 20 to 40 bases in length and complementary to either the 
region of the gene involved in transcription (triple helix - see Lee et al., Nucl. Acids 
Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et al., Science 
251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560 
(1991); Oligodeoxy-nucleotides as Antisense Inhibitors of Gene Expression, CRC 
Press, Boca Raton, FL (1988)). 

systems, and the information disclosed herein can be used to design antisense or triple 
helix polynucleotides in an effort to treat disease. 

Polynucleotides of the present invention are also useful in gene therapy. One 
goal of gene therapy is to insert a normal gene into an organism having a defective 
gene, in an effort to correct the genetic defect. The polynucleotides disclosed in the 
present invention offer a means of targeting such genetic defects in a highly accurate 
manner. Another goal is to insert a new gene that was not present in the host genome, 
thereby producing a new trait in the host cell. 

The polynucleotides are also useful for identifying individuals from minute 
biological samples. The United States military, for example, is considering the use of 
restriction fragment length polymorphism (RFLP) for identification of its personnel. In 
this technique, an individual's genomic DNA is digested with one or more restriction 
enzymes, and probed on a Southern blot to yield unique bands for identifying 
personnel. This method does not suffer from the current limitations of "Dog Tags" 
which can be lost, switched, or stolen, making positive identification difficult. The 
polynucleotides of the present invention can be used as additional DNA markers for 
RFLP. 
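
The principle behind RFLP can be illustrated with a toy digest: a sequence difference that creates or destroys a restriction site changes the set of fragment lengths. The sketch below is a simplification with a hypothetical function name. It uses the EcoRI recognition site GAATTC (cut after the first G) as an example enzyme, treats the DNA as a single linear strand, and reports every fragment, whereas a Southern blot shows only fragments that hybridize to the probe.

```python
def digest(dna, site="GAATTC", cut_offset=1):
    """Cut linear DNA at every occurrence of `site` and return the
    fragment lengths in order. `cut_offset` is the position within the
    site after which the enzyme cuts (1 for EcoRI: G^AATTC)."""
    dna = dna.upper()
    cuts = []
    pos = dna.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = dna.find(site, pos + 1)
    bounds = [0] + cuts + [len(dna)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Two made-up alleles differing by one base inside the second site.
allele_a = "AAAA" + "GAATTC" + "CCCC" + "GAATTC" + "TTTT"
allele_b = "AAAA" + "GAATTC" + "CCCC" + "GAGTTC" + "TTTT"
```

Here `digest(allele_a)` gives three fragments and `digest(allele_b)` gives two, which is the kind of band-pattern difference the technique relies on.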

The polynucleotides of the present invention can also be used as an alternative to 
RFLP, by determining the actual base-by-base DNA sequence of selected portions of an 
individual's genome. These sequences can be used to prepare PCR primers for 
amplifying and isolating such selected DNA, which can then be sequenced. Using this 
technique, individuals can be identified because each individual will have a unique set 
of DNA sequences. Once a unique ID database is established for an individual, 
positive identification of that individual, living or dead, can be made from extremely 
small tissue samples. 

Forensic biology also benefits from using DNA-based identification techniques 
as disclosed herein. DNA sequences taken from very small biological samples such as 
tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, semen, etc., can be 
amplified using PCR. In one prior art technique, gene sequences amplified from 
polymorphic loci, such as DQa class II HLA gene, are used in forensic biology to 
identify individuals. (Erlich, H., PCR Technology, Freeman and Co. (1992).) Once 
these specific polymorphic loci are amplified, they are digested with one or more 
restriction enzymes, yielding an identifying set of bands on a Southern blot probed with 
DNA corresponding to the DQa class II HLA gene. Similarly, polynucleotides of the 
present invention can be used as polymorphic markers for forensic purposes. 

There is also a need for reagents capable of identifying the source of a particular 
tissue. Such need arises, for example, in forensics when presented with tissue of 
unknown origin. Appropriate reagents can comprise, for example, DNA probes or 
primers specific to particular tissue prepared from the sequences of the present 
invention. Panels of such reagents can identify tissue by species and/or by organ type. 
In a similar fashion, these reagents can be used to screen tissue cultures for 
contamination. 

At the very least, the polynucleotides of the present invention can be used as 
molecular weight markers on Southern gels, as diagnostic probes for the presence of a 
specific mRNA in a particular cell type, as a probe to "subtract-out" known sequences 
in the process of discovering novel polynucleotides, for selecting and making oligomers 
for attachment to a "gene chip" or other support, to raise anti-DNA antibodies using 
DNA immunization techniques, and as an antigen to elicit an immune response. 



Uses of the Polypeptides 



Each of the polypeptides identified herein can be used in numerous ways. The 
following description should be considered exemplary and utilizes known techniques. 

A polypeptide of the present invention can be used to assay protein levels in a 
biological sample using antibody-based techniques. For example, protein expression in 
tissues can be studied with classical immunohistological methods. (Jalkanen, M., et 
al., J. Cell. Biol. 101:976-985 (1985); Jalkanen, M., et al., J. Cell. Biol. 105:3087-3096 
(1987).) Other antibody-based methods useful for detecting protein gene 
expression include immunoassays, such as the enzyme linked immunosorbent assay 
(ELISA) and the radioimmunoassay (RIA). Suitable antibody assay labels are known 
in the art and include enzyme labels, such as glucose oxidase, and radioisotopes, such 
as iodine (125I, 121I), carbon (14C), sulfur (35S), tritium (3H), indium (112In), and 
technetium (99mTc), and fluorescent labels, such as fluorescein and rhodamine, and 
biotin. 

In addition to assaying secreted protein levels in a biological sample, proteins 
can also be detected in vivo by imaging. Antibody labels or markers for in vivo 
imaging of protein include those detectable by X-radiography, NMR or ESR. For 
X-radiography, suitable labels include radioisotopes such as barium or cesium, which emit 
detectable radiation but are not overtly harmful to the subject. Suitable markers for 
NMR and ESR include those with a detectable characteristic spin, such as deuterium, 
which may be incorporated into the antibody by labeling of nutrients for the relevant 
hybridoma. 

resonance, is introduced (for example, parenterally, subcutaneously, or 
intraperitoneally) into the mammal. It will be understood in the art that the size of the 
subject and the imaging system used will determine the quantity of imaging moiety 
needed to produce diagnostic images. In the case of a radioisotope moiety, for a human 
subject, the quantity of radioactivity injected will normally range from about 5 to 20 
millicuries of 99mTc. The labeled antibody or antibody fragment will then 
preferentially accumulate at the location of cells which contain the specific protein. In 
vivo tumor imaging is described in S.W. Burchiel et al., "Immunopharmacokinetics of 
Radiolabeled Antibodies and Their Fragments." (Chapter 13 in Tumor Imaging: The 
Radiochemical Detection of Cancer, S.W. Burchiel and B. A. Rhodes, eds., Masson 
Publishing Inc. (1982).) 

Thus, the invention provides a diagnostic method of a disorder, which involves 
(a) assaying the expression of a polypeptide of the present invention in cells or body 
fluid of an individual; (b) comparing the level of gene expression with a standard gene 
expression level, whereby an increase or decrease in the assayed polypeptide gene 
expression level compared to the standard expression level is indicative of a disorder. 

Moreover, polypeptides of the present invention can be used to treat disease. 
For example, patients can be administered a polypeptide of the present invention in an 
effort to replace absent or decreased levels of the polypeptide (e.g., insulin), to 
supplement absent or decreased levels of a different polypeptide (e.g., hemoglobin S 
for hemoglobin B), to inhibit the activity of a polypeptide (e.g., an oncogene), to 
activate the activity of a polypeptide (e.g., by binding to a receptor), to reduce the 
activity of a membrane bound receptor by competing with it for free ligand (e.g., 
soluble TNF receptors used in reducing inflammation), or to bring about a desired 
response (e.g., blood vessel growth). 

Similarly, antibodies directed to a polypeptide of the present invention can also 
be used to treat disease. For example, administration of an antibody directed to a 
polypeptide of the present invention can bind and reduce overproduction of the 
polypeptide. Similarly, administration of an antibody can activate the polypeptide, such 
as by binding to a polypeptide bound to a membrane (receptor). 

At the very least, the polypeptides of the present invention can be used as 
molecular weight markers on SDS-PAGE gels or on molecular sieve gel filtration 
columns using methods well known to those of skill in the art. Polypeptides can also 
be used to raise antibodies, which in turn are used to measure protein expression from a 
recombinant cell, as a way of assessing transformation of the host cell. Moreover, the 
polypeptides of the present invention can be used to test the following biological 
activities. 

Biological Activities 

The polynucleotides and polypeptides of the present invention can be used in 
assays to test for one or more biological activities. If these polynucleotides and 
polypeptides do exhibit activity in a particular assay, it is likely that these molecules 
may be involved in the diseases associated with the biological activity. Thus, the 
polynucleotides and polypeptides could be used to treat the associated disease. 

Immune Activity 

A polypeptide or polynucleotide of the present invention may be useful in 
treating deficiencies or disorders of the immune system, by activating or inhibiting the 
proliferation, differentiation, or mobilization (chemotaxis) of immune cells. Immune 
cells develop through a process called hematopoiesis, producing myeloid (platelets, red 
blood cells, neutrophils, and macrophages) and lymphoid (B and T lymphocytes) cells 
from pluripotent stem cells. The etiology of these immune deficiencies or disorders 
may be genetic, somatic, such as cancer or some autoimmune disorders, acquired (e.g., 
by chemotherapy or toxins), or infectious. Moreover, a polynucleotide or polypeptide 
of the present invention can be used as a marker or detector of a particular immune 
system disease or disorder. 

A polynucleotide or polypeptide of the present invention may be useful in 
treating or detecting deficiencies or disorders of hematopoietic cells. A polypeptide or 
polynucleotide of the present invention could be used to increase differentiation and 
proliferation of hematopoietic cells, including the pluripotent stem cells, in an effort to 
treat those disorders associated with a decrease in certain (or many) types of hematopoietic 
cells. Examples of immunologic deficiency syndromes include, but are not limited to: 
blood protein disorders (e.g., agammaglobulinemia, dysgammaglobulinemia), ataxia 
telangiectasia, common variable immunodeficiency, DiGeorge Syndrome, HIV 
infection, HTLV-BLV infection, leukocyte adhesion deficiency syndrome, 
lymphopenia, phagocyte bactericidal dysfunction, severe combined immunodeficiency 
(SCIDs), Wiskott-Aldrich Disorder, anemia, thrombocytopenia, or hemoglobinuria. 

Moreover, a polypeptide or polynucleotide of the present invention could also 
be used to modulate hemostatic (the stopping of bleeding) or thrombolytic activity (clot 
formation). For example, by increasing hemostatic or thrombolytic activity, a 
polynucleotide or polypeptide of the present invention [...] 
decrease hemostatic or thrombolytic activity could be used to inhibit or dissolve 
clotting. These molecules could be important in the treatment of heart attacks 
(infarction), strokes, or scarring. 

A polynucleotide or polypeptide of the present invention may also be useful in 
treating or detecting autoimmune disorders. Many autoimmune disorders result from 
inappropriate recognition of self as foreign material by immune cells. This 
inappropriate recognition results in an immune response leading to the destruction of the 
host tissue. Therefore, the administration of a polypeptide or polynucleotide of the 
present invention that inhibits an immune response, particularly the proliferation, 
differentiation, or chemotaxis of T-cells, may be an effective therapy in preventing 
autoimmune disorders. 

Examples of autoimmune disorders that can be treated or detected by the present 
invention include, but are not limited to: Addison's Disease, hemolytic anemia, 
antiphospholipid syndrome, rheumatoid arthritis, dermatitis, allergic encephalomyelitis, 
glomerulonephritis, Goodpasture's Syndrome, Graves' Disease, Multiple Sclerosis, 
Myasthenia Gravis, Neuritis, Ophthalmia, Bullous Pemphigoid, Pemphigus, 
Polyendocrinopathies, Purpura, Reiter's Disease, Stiff-Man Syndrome, Autoimmune 
Thyroiditis, Systemic Lupus Erythematosus, Autoimmune Pulmonary Inflammation, 
Guillain-Barre Syndrome, insulin dependent diabetes mellitus, and autoimmune 
inflammatory eye disease. 

Similarly, allergic reactions and conditions, such as asthma (particularly allergic 
asthma) or other respiratory problems, may also be treated by a polypeptide or 
polynucleotide of the present invention. Moreover, these molecules can be used to treat 
anaphylaxis, hypersensitivity to an antigenic molecule, or blood group incompatibility. 

A polynucleotide or polypeptide of the present invention may also be used to 
treat and/or prevent organ rejection or graft-versus-host disease (GVHD). Organ 
rejection occurs by host immune cell destruction of the transplanted tissue through an 
immune response. Similarly, an immune response is also involved in GVHD, but, in 
this case, the foreign transplanted immune cells destroy the host tissues. The 
administration of a polypeptide or polynucleotide of the present invention that inhibits 
an immune response, particularly the proliferation, differentiation, or chemotaxis of 
T-cells, may be an effective therapy in preventing organ rejection or GVHD. 

Similarly, a polypeptide or polynucleotide of the present invention may also be 
used to modulate inflammation. For example, the polypeptide or polynucleotide may 
inhibit the proliferation and differentiation of cells involved in an inflammatory 
response. These molecules can be used to treat inflammatory conditions, both chronic 
and acute conditions, including inflammation associated with infection (e.g., septic 
shock, sepsis, or systemic inflammatory response syndrome (SIRS)), 
ischemia-reperfusion injury, endotoxin lethality, arthritis, complement-mediated hyperacute 
rejection, nephritis, cytokine or chemokine induced lung injury, inflammatory bowel 
disease, Crohn's disease, or resulting from over production of cytokines (e.g., TNF or 
IL-1). 

Hyperproliferative Disorders 

A polypeptide or polynucleotide can be used to treat or detect hyperproliferative 
disorders, including neoplasms. A polypeptide or polynucleotide of the present 
invention may inhibit the proliferation of the disorder through direct or indirect 
interactions. Alternatively, a polypeptide or polynucleotide of the present invention 
may proliferate other cells which can inhibit the hyperproliferative disorder. 

For example, by increasing an immune response, particularly increasing 
antigenic qualities of the hyperproliferative disorder or by proliferating, differentiating, 
or mobilizing T-cells, hyperproliferative disorders can be treated. This immune 
response may be increased by either enhancing an existing immune response, or by 
initiating a new immune response. Alternatively, decreasing an immune response may 
also be a method of treating hyperproliferative disorders, such as by acting as a 
chemotherapeutic agent. 

Examples of hyperproliferative disorders that can be treated or detected by a 
polynucleotide or polypeptide of the present invention include, but are not limited to, 
neoplasms located in the: abdomen, bone, breast, digestive system, liver, pancreas, 
peritoneum, endocrine glands (adrenal, parathyroid, pituitary, testicles, ovary, thymus, 
thyroid), eye, head and neck, nervous (central and peripheral), lymphatic system, 
pelvic, skin, soft tissue, spleen, thoracic, and urogenital. 

Similarly, other hyperproliferative disorders can also be treated or detected by a 
polynucleotide or polypeptide of the present invention. Examples of such 
hyperproliferative disorders include, but are not limited to: hypergammaglobulinemia, 
lymphoproliferative disorders, paraproteinemias, purpura, sarcoidosis, Sezary 
Syndrome, Waldenstrom's Macroglobulinemia, Gaucher's Disease, histiocytosis, and 
any other hyperproliferative disease, besides neoplasia, located in an organ system 
listed above. 

Infectious Disease 



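A polypeptide or polynucleotide of the present invention can be used to treat or 
detect infectious agents. For example, by increasing the immune response, particularly 
increasing the proliferation and differentiation of B and/or T cells, infectious diseases 
may be treated. The immune response may be increased by either enhancing an existing 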
immune response, or by initiating a new immune response. Alternatively, the 
polypeptide or polynucleotide of the present invention may also directly inhibit the 
infectious agent, without necessarily eliciting an immune response. 

Viruses are one example of an infectious agent that can cause disease or 
symptoms that can be treated or detected by a polynucleotide or polypeptide of the 
present invention. Examples of viruses include, but are not limited to, the following 
DNA and RNA viral families: Arbovirus, Adenoviridae, Arenaviridae, Arterivirus, 
Birnaviridae, Bunyaviridae, Caliciviridae, Circoviridae, Coronaviridae, Flaviviridae, 
Hepadnaviridae (Hepatitis), Herpesviridae (such as Cytomegalovirus, Herpes 
Simplex, Herpes Zoster), Mononegavirus (e.g., Paramyxoviridae, Morbillivirus, 
Rhabdoviridae), Orthomyxoviridae (e.g., Influenza), Papovaviridae, Parvoviridae, 
Picornaviridae, Poxviridae (such as Smallpox or Vaccinia), Reoviridae (e.g., 
Rotavirus), Retroviridae (HTLV-I, HTLV-II, Lentivirus), and Togaviridae (e.g., 
Rubivirus). Viruses falling within these families can cause a variety of diseases or 
symptoms, including, but not limited to: arthritis, bronchiolitis, encephalitis, eye 
infections (e.g., conjunctivitis, keratitis), chronic fatigue syndrome, hepatitis (A, B, C, 
E, Chronic Active, Delta), meningitis, opportunistic infections (e.g., AIDS), 
pneumonia, Burkitt's Lymphoma, chickenpox, hemorrhagic fever, Measles, Mumps, 
Parainfluenza, Rabies, the common cold, Polio, leukemia, Rubella, sexually 
transmitted diseases, skin diseases (e.g., Kaposi's, warts), and viremia. A polypeptide 
or polynucleotide of the present invention can be used to treat or detect any of these 
symptoms or diseases. 

Similarly, bacterial or fungal agents that can cause disease or symptoms and that 
can be treated or detected by a polynucleotide or polypeptide of the present invention 
include, but are not limited to, the following Gram-negative and Gram-positive bacterial 
families and fungi: Actinomycetales (e.g., Corynebacterium, Mycobacterium, 
Nocardia), Aspergillosis, Bacillaceae (e.g., Anthrax, Clostridium), Bacteroidaceae, 
Blastomycosis, Bordetella, Borrelia, Brucellosis, Candidiasis, Campylobacter, 
Coccidioidomycosis, Cryptococcosis, Dermatomycoses, Enterobacteriaceae (Klebsiella, 
Salmonella, Serratia, Yersinia), Erysipelothrix, Helicobacter, Legionellosis, 
Leptospirosis, Listeria, Mycoplasmatales, Neisseriaceae (e.g., Acinetobacter, 
Gonorrhea, Meningococcal), Pasteurellaceae Infections (e.g., Actinobacillus, 
Haemophilus, Pasteurella), Pseudomonas, Rickettsiaceae, Chlamydiaceae, Syphilis, 
and Staphylococcal. These bacterial or fungal families can cause the following diseases 
or symptoms, including, but not limited to: bacteremia, endocarditis, eye infections 
(conjunctivitis, tuberculosis, uveitis), gingivitis, opportunistic infections (e.g., AIDS 
related infections), paronychia, prosthesis-related infections, Reiter's Disease, 
respiratory tract infections, such as Whooping Cough or Empyema, sepsis, Lyme 
Disease, Cat-Scratch Disease, Dysentery, Paratyphoid Fever, food poisoning, 
Typhoid, pneumonia, Gonorrhea, meningitis, Chlamydia, Syphilis, Diphtheria, 
Leprosy, Paratuberculosis, Tuberculosis, Lupus, Botulism, gangrene, tetanus, 
impetigo, Rheumatic Fever, Scarlet Fever, sexually transmitted diseases, skin diseases 
(e.g., cellulitis, dermatomycoses), toxemia, urinary tract infections, and wound infections. 
A polypeptide or polynucleotide of the present invention can be used to treat or detect 
any of these symptoms or diseases. 

Moreover, parasitic agents causing disease or symptoms that can be treated or 
detected by a polynucleotide or polypeptide of the present invention include, but are not 
limited to, the following families: Amebiasis, Babesiosis, Coccidiosis, 
Cryptosporidiosis, Dientamoebiasis, Dourine, Ectoparasitic, Giardiasis, Helminthiasis, 
Leishmaniasis, Theileriasis, Toxoplasmosis, Trypanosomiasis, and Trichomonas. 
These parasites can cause a variety of diseases or symptoms, including, but not limited 
to: Scabies, Trombiculiasis, eye infections, intestinal disease (e.g., dysentery, 
giardiasis), liver disease, lung disease, opportunistic infections (e.g., AIDS related), 
Malaria, pregnancy complications, and toxoplasmosis. A polypeptide or polynucleotide 
of the present invention can be used to treat or detect any of these symptoms or 
diseases. 

Preferably, treatment using a polypeptide or polynucleotide of the present 
invention could either be by administering an effective amount of a polypeptide to the 
patient, or by removing cells from the patient, supplying the cells with a polynucleotide 
of the present invention, and returning the engineered cells to the patient (ex vivo 
therapy). Moreover, the polypeptide or polynucleotide of the present invention can be 
used as an antigen in a vaccine to raise an immune response against infectious disease. 

Regeneration 

A polynucleotide or polypeptide of the present invention can be used to 
differentiate, proliferate, and attract cells, leading to the regeneration of tissues. (See, 
Science 276:59-87 (1997).) The regeneration of tissues could be used to repair, 
replace, or protect tissue damaged by congenital defects, trauma (wounds, burns, 
incisions, or ulcers), age, disease (e.g., osteoporosis, osteoarthritis, periodontal 
[...] 
(e.g., pancreas, liver, intestine, kidney, skin, endothelium), muscle (smooth, skeletal 
or cardiac), vascular (including vascular endothelium), nervous, hematopoietic, and 
skeletal (bone, cartilage, tendon, and ligament) tissue. Preferably, regeneration occurs 
without scarring or with decreased scarring. Regeneration also may include angiogenesis. 

Moreover, a polynucleotide or polypeptide of the present invention may increase 
regeneration of tissues difficult to heal. For example, increased tendon/ligament 
regeneration would quicken recovery time after damage. A polynucleotide or 
polypeptide of the present invention could also be used prophylactically in an effort to 
avoid damage. Specific diseases that could be treated include tendinitis, carpal tunnel 
syndrome, and other tendon or ligament defects. A further example of tissue 
regeneration of non-healing wounds includes pressure ulcers, ulcers associated with 
vascular insufficiency, surgical, and traumatic wounds. 

Similarly, nerve and brain tissue could also be regenerated by using a 
polynucleotide or polypeptide of the present invention to proliferate and differentiate 
nerve cells. Diseases that could be treated using this method include central and 
peripheral nervous system diseases, neuropathies, or mechanical and traumatic 
disorders (e.g., spinal cord disorders, head trauma, cerebrovascular disease, and 
stroke). Specifically, diseases associated with peripheral nerve injuries, peripheral 
neuropathy (e.g., resulting from chemotherapy or other medical therapies), localized 
neuropathies, and central nervous system diseases (e.g., Alzheimer's disease, 
Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis, and 
Shy-Drager syndrome), could all be treated using the polynucleotide or polypeptide of the 
present invention. 

Chemotaxis 

A polynucleotide or polypeptide of the present invention may have chemotaxis 
activity. A chemotactic molecule attracts or mobilizes cells (e.g., monocytes, 
fibroblasts, neutrophils, T-cells, mast cells, eosinophils, epithelial and/or endothelial 
cells) to a particular site in the body, such as inflammation, infection, or site of 
hyperproliferation. The mobilized cells can then fight off and/or heal the particular 
trauma or abnormality. 

A polynucleotide or polypeptide of the present invention may increase 
chemotactic activity of particular cells. These chemotactic molecules can then be used to 
treat inflammation, infection, hyperproliferative disorders, or any immune system 
disorder by increasing the number of cells targeted to a particular location in the body. 
For example, chemotactic molecules can be used to treat wounds and other trauma to 
tissues by attracting immune cells to the injured location. Chemotactic molecules of the 
present invention can also attract fibroblasts, which can be used to treat wounds. 

It is also contemplated that a polynucleotide or polypeptide of the present 
invention may inhibit chemotactic activity. These molecules could also be used to treat 
disorders. Thus, a polynucleotide or polypeptide of the present invention could be used 
as an inhibitor of chemotaxis. 


Binding Activity 

A polypeptide of the present invention may be used to screen for molecules that 
bind to the polypeptide or for molecules to which the polypeptide binds. The binding 
of the polypeptide and the molecule may activate (agonist), increase, inhibit 
(antagonist), or decrease activity of the polypeptide or the molecule bound. Examples 
of such molecules include antibodies, oligonucleotides, proteins (e.g., receptors), or 
small molecules. 

Preferably, the molecule is closely related to the natural ligand of the 
polypeptide, e.g., a fragment of the ligand, or a natural substrate, a ligand, a structural 
or functional mimetic. (See, Coligan et al., Current Protocols in Immunology 
1(2):Chapter 5 (1991).) Similarly, the molecule can be closely related to the natural 
receptor to which the polypeptide binds, or at least, a fragment of the receptor capable 
of being bound by the polypeptide (e.g., active site). In either case, the molecule can 
be rationally designed using known techniques. 

Preferably, the screening for these molecules involves producing appropriate 
cells which express the polypeptide, either as a secreted protein or on the cell 
membrane. Preferred cells include cells from mammals, yeast, Drosophila, or E. coli. 
Cells expressing the polypeptide (or cell membrane containing the expressed 
polypeptide) are then preferably contacted with a test compound potentially containing 
the molecule to observe binding, stimulation, or inhibition of activity of either the 
polypeptide or the molecule. 

The assay may simply test binding of a candidate compound to the polypeptide, 
wherein binding is detected by a label, or in an assay involving competition with a 
labeled competitor. Further, the assay may test whether the candidate compound results 
in a signal generated by binding to the polypeptide. 

Alternatively, the assay can be carried out using cell-free preparations, 
polypeptide/molecule affixed to a solid support, chemical libraries, or natural product 
mixtures. The assay may also simply comprise the steps of mixing a candidate 
compound with a solution containing a polypeptide [...] 

Preferably, an ELISA assay can measure polypeptide level or activity in a 
sample (e.g., biological sample) using a monoclonal or polyclonal antibody. The 
antibody can measure polypeptide level or activity by either binding, directly or 
indirectly, to the polypeptide or by competing with the polypeptide for a substrate. 

All of these above assays can be used as diagnostic or prognostic markers. The 
molecules discovered using these assays can be used to treat disease or to bring about a 
particular result in a patient (e.g., blood vessel growth) by activating or inhibiting the 
polypeptide/molecule. Moreover, the assays can discover agents which may inhibit or 
enhance the production of the polypeptide from suitably manipulated cells or tissues. 

Therefore, the invention includes a method of identifying compounds which 
bind to a polypeptide of the invention comprising the steps of: (a) incubating a 
candidate binding compound with a polypeptide of the invention; and (b) determining if 
binding has occurred. Moreover, the invention includes a method of identifying 
agonists/antagonists comprising the steps of: (a) incubating a candidate compound with 
a polypeptide of the invention, (b) assaying a biological activity, and (c) determining if 
a biological activity of the polypeptide has been altered. 
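
The final step of the agonist/antagonist method, determining whether a biological activity has been altered, amounts to comparing activity measured with and without the candidate compound. The following Python sketch illustrates one way such a comparison might be scored. The function name `classify_candidate`, the use of replicate means, and the 1.5-fold cutoff are hypothetical choices made for illustration only; none of them is specified in this document, and a real assay would call for appropriate controls and statistical testing.

```python
from statistics import mean


def classify_candidate(baseline, treated, fold_change=1.5):
    """Label a candidate compound from replicate activity readings.

    baseline    -- activity readings for the polypeptide alone
    treated     -- activity readings for the polypeptide plus candidate
    fold_change -- hypothetical cutoff (assumption, not from the source)
    """
    if not baseline or not treated:
        raise ValueError("both groups need at least one reading")
    base = mean(baseline)
    if base <= 0:
        raise ValueError("baseline activity must be positive")

    # Ratio of mean treated activity to mean baseline activity.
    ratio = mean(treated) / base

    if ratio >= fold_change:
        return "agonist"      # activity increased beyond the cutoff
    if ratio <= 1.0 / fold_change:
        return "antagonist"   # activity decreased beyond the cutoff
    return "no effect"        # change falls inside the cutoff band


if __name__ == "__main__":
    print(classify_candidate([1.0, 1.1, 0.9], [2.0, 2.2, 1.8]))
```

A symmetric fold-change band is used here so that a doubling and a halving of activity are treated alike; other decision rules are equally plausible.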

Other Activities 

A polypeptide or polynucleotide of the present invention may also increase or 
decrease the differentiation or proliferation of embryonic stem cells, in addition to, 
as discussed above, cells of the hematopoietic lineage. 

A polypeptide or polynucleotide of the present invention may also be used to 
modulate mammalian characteristics, such as body height, weight, hair color, eye color, 
skin, percentage of adipose tissue, pigmentation, size, and shape (e.g., cosmetic 
surgery). Similarly, a polypeptide or polynucleotide of the present invention may be 
used to modulate mammalian metabolism affecting catabolism, anabolism, processing, 
utilization, and storage of energy. 

A polypeptide or polynucleotide of the present invention may be used to change 
a mammal's mental state or physical state by influencing biorhythms, circadian 
rhythms, depression (including depressive disorders), tendency for violence, tolerance 
for pain, reproductive capabilities (preferably by Activin or Inhibin-like activity), 
hormonal or endocrine levels, appetite, libido, memory, stress, or other cognitive 
qualities. 

A polypeptide or polynucleotide of the present invention may also be used as a 
food additive or preservative, such as to increase or decrease storage capabilities, fat 
content, lipid, protein, carbohydrate, vitamins, minerals, cofactors or other nutritional 
components. 

Other Preferred Embodiments 

Other preferred embodiments of the claimed invention include an isolated 
nucleic acid molecule comprising a nucleotide sequence which is at least 95% identical 
to a sequence of at least about 50 contiguous nucleotides in the nucleotide sequence of 
SEQ ID NO:X wherein X is any integer as defined in Table 1. 
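
The "at least 95% identical to a sequence of at least about 50 contiguous nucleotides" criterion can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions: identity is computed position by position without gaps, every 50-nucleotide window of the query is compared against every 50-nucleotide window of the reference, and the function names (`percent_identity`, `best_window_identity`, `meets_identity_threshold`) are hypothetical. It is not the sequence comparison method used by this document, and an alignment program that allows gaps may give different values.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def best_window_identity(query: str, reference: str, window: int = 50) -> float:
    """Highest ungapped identity between any window of the query and
    any window of the reference (brute force, for short sequences)."""
    if len(query) < window or len(reference) < window:
        return 0.0
    best = 0.0
    for i in range(len(query) - window + 1):
        q = query[i:i + window]
        for j in range(len(reference) - window + 1):
            best = max(best, percent_identity(q, reference[j:j + window]))
    return best


def meets_identity_threshold(query: str, reference: str,
                             window: int = 50, threshold: float = 95.0) -> bool:
    """True if some window of the query reaches the identity threshold."""
    return best_window_identity(query, reference, window) >= threshold


if __name__ == "__main__":
    reference = "ACGTTGCA" * 10          # toy 80-nucleotide reference
    query = reference[10:60]             # exact 50-nucleotide stretch
    print(meets_identity_threshold(query, reference))
```

With a 50-nucleotide window, the 95% threshold tolerates at most two mismatched positions (48 of 50 matching gives 96%, while 47 of 50 gives 94%).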

Also preferred is a nucleic acid molecule wherein said sequence of contiguous 
nucleotides is included in the nucleotide sequence of SEQ ID NO:X in the range of 
positions beginning with the nucleotide at about the position of the 5' Nucleotide of the 
Clone Sequence and ending with the nucleotide at about the position of the 3' 
Nucleotide of the Clone Sequence as defined for SEQ ID NO:X in Table 1. 

Also preferred is a nucleic acid molecule wherein said sequence of contiguous 
nucleotides is included in the nucleotide sequence of SEQ ID NO:X in the range of 
positions beginning with the nucleotide at about the position of the 5' Nucleotide of the 
Start Codon and ending with the nucleotide at about the position of the 3' Nucleotide of 
the Clone Sequence as defined for SEQ ID NO:X in Table 1. 

Similarly preferred is a nucleic acid molecule wherein said sequence of 
contiguous nucleotides is included in the nucleotide sequence of SEQ ID NO:X in the 
range of positions beginning with the nucleotide at about the position of the 5' 
Nucleotide of the First Amino Acid of the Signal Peptide and ending with the nucleotide 
at about the position of the 3' Nucleotide of the Clone Sequence as defined for SEQ ID 
NO:X in Table 1. 

Also preferred is an isolated nucleic acid molecule comprising a nucleotide 
sequence which is at least 95% identical to a sequence of at least about 150 contiguous 
nucleotides in the nucleotide sequence of SEQ ID NO:X. 

Further preferred is an isolated nucleic acid molecule comprising a nucleotide 
sequence which is at least 95% identical to a sequence of at least about 500 contiguous 
nucleotides in the nucleotide sequence of SEQ ID NO:X. 

A further preferred embodiment is a nucleic acid molecule comprising a 
nucleotide sequence which is at least 95% identical to the nucleotide sequence of SEQ 
ID NO:X beginning with the nucleotide at about the position of the 5' Nucleotide of the 
First Amino Acid of the Signal Peptide and ending with the nucleotide at about the 
position of the 3' Nucleotide of the Clone Sequence as defined for SEQ ID NO:X in 
Table 1. 

A further preferred embodiment is an isolated nucleic acid molecule comprising 
a nucleotide sequence which is at least 95% identical to the complete nucleotide 
sequence of SEQ ID NO:X. 

Also preferred is an isolated nucleic acid molecule which hybridizes under 
stringent hybridization conditions to a nucleic acid molecule, wherein said nucleic acid 
molecule which hybridizes does not hybridize under stringent hybridization conditions 
to a nucleic acid molecule having a nucleotide sequence consisting of only A residues or 
of only T residues. 

Also preferred is a composition of matter comprising a DNA molecule which 
comprises a human cDNA clone identified by a cDNA Clone Identifier in Table 1, 
which DNA molecule is contained in the material deposited with the American Type 
Culture Collection and given the ATCC Deposit Number shown in Table 1 for said 
cDNA Clone Identifier. 

Also preferred is an isolated nucleic acid molecule comprising a nucleotide 
sequence which is at least 95% identical to a sequence of at least 50 contiguous 
nucleotides in the nucleotide sequence of a human cDNA clone identified by a cDNA 
Clone Identifier in Table 1, which DNA molecule is contained in the deposit given the 
ATCC Deposit Number shown in Table 1. 

Also preferred is an isolated nucleic acid molecule, wherein said sequence of at 
least 50 contiguous nucleotides is included in the nucleotide sequence of the complete 
open reading frame sequence encoded by said human cDNA clone. 

Also preferred is an isolated nucleic acid molecule comprising a nucleotide 
sequence which is at least 95% identical to a sequence of at least 150 contiguous 
nucleotides in the nucleotide sequence encoded by said human cDNA clone. 

A further preferred embodiment is an isolated nucleic acid molecule comprising 
a nucleotide sequence which is at least 95% identical to a sequence of at least 500 
contiguous nucleotides in the nucleotide sequence encoded by said human cDNA clone. 

A further preferred embodiment is an isolated nucleic acid molecule comprising 
a nucleotide sequence which is at least 95% identical to the complete nucleotide 
sequence encoded by said human cDNA clone. 

A further preferred embodiment is a method for detecting in a biological sample 
a nucleic acid molecule comprising a nucleotide sequence which is at least 95% identical 
to a sequence of at least 50 contiguous nucleotides in a sequence selected from the 
group consisting of: a nucleotide sequence of SEQ ID NO:X wherein X is any integer 
as defined in Table 1; and a nucleotide sequence encoded by a human cDNA clone 
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the 
ATCC Deposit Number shown for said cDNA clone in Table 1; which method 
comprises a step of comparing a nucleotide sequence of at least one nucleic acid 
molecule in said sample with a sequence selected from said group and determining 
whether the sequence of said nucleic acid molecule in said sample is at least 95% 
identical to said selected sequence. 

Also preferred is the above method wherein said step of comparing sequences 
comprises determining the extent of nucleic acid hybridization between nucleic acid 
molecules in said sample and a nucleic acid molecule comprising said sequence selected 
from said group. Similarly, also preferred is the above method wherein said step of 
comparing sequences is performed by comparing the nucleotide sequence determined 
from a nucleic acid molecule in said sample with said sequence selected from said 
group. The nucleic acid molecules can comprise DNA molecules or RNA molecules. 

A further preferred embodiment is a method for identifying the species, tissue or 
cell type of a biological sample which method comprises a step of detecting nucleic acid 
molecules in said sample, if any, comprising a nucleotide sequence that is at least 95% 
identical to a sequence of at least 50 contiguous nucleotides in a sequence selected from 
the group consisting of: a nucleotide sequence of SEQ ID NO:X wherein X is any 
integer as defined in Table 1; and a nucleotide sequence encoded by a human cDNA 
clone identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with 
the ATCC Deposit Number shown for said cDNA clone in Table 1. 

The method for identifying the species, tissue or cell type of a biological sample 
can comprise a step of detecting nucleic acid molecules comprising a nucleotide 
sequence in a panel of at least two nucleotide sequences, wherein at least one sequence 
in said panel is at least 95% identical to a sequence of at least 50 contiguous nucleotides 
in a sequence selected from said group. 

Also preferred is a method for diagnosing in a subject a pathological condition 
associated with abnormal structure or expression of a gene encoding a secreted protein 
identified in Table 1, which method comprises a step of detecting in a biological sample 
obtained from said subject nucleic acid molecules, if any, comprising a nucleotide 
sequence that is at least 95% identical to a sequence of at least 50 contiguous 
nucleotides in a sequence selected from the group consisting of: a nucleotide sequence 
of SEQ ID NO:X wherein X is any integer as defined in Table 1; and a nucleotide 
sequence encoded by a human cDNA clone identified by a cDNA Clone Identifier in 
Table 1 and contained in the deposit with the ATCC Deposit Number shown for said 
cDNA clone in Table 1. 
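
The method for diagnosing a pathological condition can comprise a step of 
detecting nucleic acid molecules comprising a nucleotide sequence in a panel of at least 
two nucleotide sequences, wherein at least one sequence in said panel is at least 95% 
identical to a sequence of at least 50 contiguous nucleotides in a sequence selected from 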
said group. 

Also preferred is a composition of matter comprising isolated nucleic acid 
molecules wherein the nucleotide sequences of said nucleic acid molecules comprise a 
panel of at least two nucleotide sequences, wherein at least one sequence in said panel is 
at least 95% identical to a sequence of at least 50 contiguous nucleotides in a sequence 
selected from the group consisting of: a nucleotide sequence of SEQ ID NO:X wherein 
X is any integer as defined in Table 1; and a nucleotide sequence encoded by a human 
cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained in the 
deposit with the ATCC Deposit Number shown for said cDNA clone in Table 1. The 
nucleic acid molecules can comprise DNA molecules or RNA molecules. 

Also preferred is an isolated polypeptide comprising an amino acid sequence at 
least 90% identical to a sequence of at least about 10 contiguous amino acids in the 
amino acid sequence of SEQ ID NO:Y wherein Y is any integer as defined in Table 1. 

Also preferred is a polypeptide, wherein said sequence of contiguous amino 
acids is included in the amino acid sequence of SEQ ID NO:Y in the range of positions 
beginning with the residue at about the position of the First Amino Acid of the Secreted 
Portion and ending with the residue at about the Last Amino Acid of the Open Reading 
Frame as set forth for SEQ ID NO:Y in Table 1. 

Also preferred is an isolated polypeptide comprising an amino acid sequence at 
least 95% identical to a sequence of at least about 30 contiguous amino acids in the 
amino acid sequence of SEQ ID NO:Y. 

Further preferred is an isolated polypeptide comprising an amino acid sequence 
at least 95% identical to a sequence of at least about 100 contiguous amino acids in the 
amino acid sequence of SEQ ID NO:Y. 

Further preferred is an isolated polypeptide comprising an amino acid sequence 
at least 95% identical to the complete amino acid sequence of SEQ ID NO:Y. 

Further preferred is an isolated polypeptide comprising an amino acid sequence 
at least 90% identical to a sequence of at least about 10 contiguous amino acids in the 
complete amino acid sequence of a secreted protein encoded by a human cDNA clone 
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the 
ATCC Deposit Number shown for said cDNA clone in Table 1. 

Also preferred is a polypeptide wherein said sequence of contiguous amino 
acids is included in the amino acid sequence of a secreted portion of the secreted protein 
encoded by a human cDNA clone identified by a cDNA Clone Identifier in Table 1 and 
contained in the deposit with the ATCC Deposit Number shown for said cDNA clone in 
Table 1. 



WO 98/54963 



PCT/US98/11422 






Also preferred is an isolated polypeptide comprising an amino acid sequence at 
least 95% identical to a sequence of at least about 30 contiguous amino acids in the 
amino acid sequence of the secreted portion of the protein encoded by a human cDNA 
clone identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with 
the ATCC Deposit Number shown for said cDNA clone in Table 1. 

Also preferred is an isolated polypeptide comprising an amino acid sequence at 
least 95% identical to a sequence of at least about 100 contiguous amino acids in the 
amino acid sequence of the secreted portion of the protein encoded by a human cDNA 
clone identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with 
the ATCC Deposit Number shown for said cDNA clone in Table 1. 

Also preferred is an isolated polypeptide comprising an amino acid sequence at 
least 95% identical to the amino acid sequence of the secreted portion of the protein 
encoded by a human cDNA clone identified by a cDNA Clone Identifier in Table 1 and 
contained in the deposit with the ATCC Deposit Number shown for said cDNA clone in 
Table 1. 

Further preferred is an isolated antibody which binds specifically to a 
polypeptide comprising an amino acid sequence that is at least 90% identical to a 
sequence of at least 10 contiguous amino acids in a sequence selected from the group 
consisting of: an amino acid sequence of SEQ ID NO:Y wherein Y is any integer as 
defined in Table 1; and a complete amino acid sequence of a protein encoded by a 
human cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained in 
the deposit with the ATCC Deposit Number shown for said cDNA clone in Table 1. 

Further preferred is a method for detecting in a biological sample a polypeptide 
comprising an amino acid sequence which is at least 90% identical to a sequence of at 
least 10 contiguous amino acids in a sequence selected from the group consisting of: an 
amino acid sequence of SEQ ID NO:Y wherein Y is any integer as defined in Table 1; 
and a complete amino acid sequence of a protein encoded by a human cDNA clone 
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the 
ATCC Deposit Number shown for said cDNA clone in Table 1; which method 
comprises a step of comparing an amino acid sequence of at least one polypeptide 
molecule in said sample with a sequence selected from said group and determining 
whether the sequence of said polypeptide molecule in said sample is at least 90% 
identical to said sequence of at least 10 contiguous amino acids. 

comprising an amino acid sequence that is at least 90% identical to a sequence of at least 
10 contiguous amino acids in a sequence selected from the group consisting of: an 
amino acid sequence of SEQ ID NO:Y wherein Y is any integer as defined in Table 1; 
and a complete amino acid sequence of a protein encoded by a human cDNA clone 
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the 
ATCC Deposit Number shown for said cDNA clone in Table 1. 

Also preferred is the above method wherein said step of comparing sequences is 
performed by comparing the amino acid sequence determined from a polypeptide 
molecule in said sample with said sequence selected from said group. 
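
The sequence comparison recited above can be illustrated with a minimal sketch. It assumes a simple ungapped, position-by-position comparison over every window of contiguous residues; the function names, the default window of 10 residues and the 90% threshold are illustrative choices taken from the wording above, and a practical comparison would more likely rely on an established alignment program.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions at which two equal-length sequences match."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def has_identical_window(sample: str, reference: str,
                         window: int = 10, threshold: float = 90.0) -> bool:
    """True if any `window`-residue stretch of `sample` is at least
    `threshold` percent identical to an equal-length stretch of `reference`.

    Ungapped comparison only (an assumption of this sketch)."""
    for i in range(len(sample) - window + 1):
        stretch = sample[i:i + window]
        for j in range(len(reference) - window + 1):
            if percent_identity(stretch, reference[j:j + window]) >= threshold:
                return True
    return False
```

For example, a 10-residue sample stretch that differs from the reference at a single position is 90% identical and therefore satisfies the criterion, whereas two mismatches in 10 residues (80%) do not.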

Also preferred is a method for identifying the species, tissue or cell type of a 
biological sample which method comprises a step of detecting polypeptide molecules in 
said sample, if any, comprising an amino acid sequence that is at least 90% identical to 
a sequence of at least 10 contiguous amino acids in a sequence selected from the group 
consisting of: an amino acid sequence of SEQ ID NO:Y wherein Y is any integer as 
defined in Table 1; and a complete amino acid sequence of a secreted protein encoded 
by a human cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained 
in the deposit with the ATCC Deposit Number shown for said cDNA clone in Table 1. 

Also preferred is the above method for identifying the species, tissue or cell type 
of a biological sample, which method comprises a step of detecting polypeptide 
molecules comprising an amino acid sequence in a panel of at least two amino acid 
sequences, wherein at least one sequence in said panel is at least 90% identical to a 
sequence of at least 10 contiguous amino acids in a sequence selected from the above 
group. 

Also preferred is a method for diagnosing in a subject a pathological condition 
associated with abnormal structure or expression of a gene encoding a secreted protein 
identified in Table 1, which method comprises a step of detecting in a biological sample 
obtained from said subject polypeptide molecules comprising an amino acid sequence in 
a panel of at least two amino acid sequences, wherein at least one sequence in said panel 
is at least 90% identical to a sequence of at least 10 contiguous amino acids in a 
sequence selected from the group consisting of: an amino acid sequence of SEQ ID 
NO:Y wherein Y is any integer as defined in Table 1; and a complete amino acid 
sequence of a secreted protein encoded by a human cDNA clone identified by a cDNA 
Clone Identifier in Table 1 and contained in the deposit with the ATCC Deposit Number 
shown for said cDNA clone in Table 1. 

In any of these methods, the step of detecting said polypeptide molecules 
includes using an antibody. 

Also preferred is an isolated nucleic acid molecule comprising a nucleotide 
sequence which is at least 95% identical to a nucleotide sequence encoding a 
polypeptide wherein said polypeptide comprises an amino acid sequence that is at least 
90% identical to a sequence of at least 10 contiguous amino acids in a sequence selected 
from the group consisting of: an amino acid sequence of SEQ ID NO:Y wherein Y is 
any integer as defined in Table 1; and a complete amino acid sequence of a secreted 
protein encoded by a human cDNA clone identified by a cDNA Clone Identifier in Table 
1 and contained in the deposit with the ATCC Deposit Number shown for said cDNA 
clone in Table 1. 

Also preferred is an isolated nucleic acid molecule, wherein said nucleotide 
sequence encoding a polypeptide has been optimized for expression of said polypeptide 
in a prokaryotic host. 

Also preferred is an isolated nucleic acid molecule, wherein said polypeptide 
comprises an amino acid sequence selected from the group consisting of: an amino acid 
sequence of SEQ ID NO:Y wherein Y is any integer as defined in Table 1; and a 
complete amino acid sequence of a secreted protein encoded by a human cDNA clone 
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the 
ATCC Deposit Number shown for said cDNA clone in Table 1. 

Further preferred is a method of making a recombinant vector comprising 
inserting any of the above isolated nucleic acid molecules into a vector. Also preferred is 
the recombinant vector produced by this method. Also preferred is a method of making 
a recombinant host cell comprising introducing the vector into a host cell, as well as the 
recombinant host cell produced by this method. 

Also preferred is a method of making an isolated polypeptide comprising 
culturing this recombinant host cell under conditions such that said polypeptide is 
expressed and recovering said polypeptide. Also preferred is this method of making an 
isolated polypeptide, wherein said recombinant host cell is a eukaryotic cell and said 
polypeptide is a secreted portion of a human secreted protein comprising an amino acid 
sequence selected from the group consisting of: an amino acid sequence of SEQ ID 
NO:Y beginning with the residue at the position of the First Amino Acid of the Secreted 
Portion of SEQ ID NO:Y wherein Y is an integer set forth in Table 1 and said position 
of the First Amino Acid of the Secreted Portion of SEQ ID NO:Y is defined in Table 1; 
and an amino acid sequence of a secreted portion of a protein encoded by a human 
cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained in the 
deposit with the ATCC Deposit Number shown for said cDNA clone in Table 1. 

Also preferred is a method of treatment of an individual in need of an increased 
level of a secreted protein activity, which method comprises administering to such an 
individual a pharmaceutical composition comprising an amount of an isolated 
polypeptide, polynucleotide, or antibody of the claimed invention effective to increase 
the level of said protein activity in said individual. 

Having generally described the invention, the same will be more readily 
understood by reference to the following examples, which are provided by way of 
illustration and are not intended as limiting. 

Examples 

Example 1: Isolation of a Selected cDNA Clone From the Deposited Sample 

Each cDNA clone in a cited ATCC deposit is contained in a plasmid vector. 
Table 1 identifies the vectors used to construct the cDNA library from which each clone 
was isolated. In many cases, the vector used to construct the library is a phage vector 
from which a plasmid has been excised. The table immediately below correlates the 
related plasmid for each phage vector used in constructing the cDNA library. For 
example, where a particular clone is identified in Table 1 as being isolated in the vector 
"Lambda Zap," the corresponding deposited clone is in "pBluescript." 

Vector Used to Construct Library    Corresponding Deposited Plasmid 
Lambda Zap                          pBluescript (pBS) 
Uni-Zap XR                          pBluescript (pBS) 
Zap Express                         pBK 
lafmid BA                           plafmid BA 
pSport1                             pSport1 
pCMVSport 2.0                       pCMVSport 2.0 
pCMVSport 3.0                       pCMVSport 3.0 
pCR®2.1                             pCR®2.1 

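
For record keeping, the correspondence in the table above can be held as a simple lookup. This is only an illustrative sketch: the dictionary keys follow the vector names as printed in the table (with the trademark symbol omitted from pCR2.1), and the function name is hypothetical.

```python
# Library vector -> deposited plasmid, transcribed from the table above.
DEPOSITED_PLASMID = {
    "Lambda Zap": "pBluescript (pBS)",
    "Uni-Zap XR": "pBluescript (pBS)",
    "Zap Express": "pBK",
    "lafmid BA": "plafmid BA",
    "pSport1": "pSport1",
    "pCMVSport 2.0": "pCMVSport 2.0",
    "pCMVSport 3.0": "pCMVSport 3.0",
    "pCR2.1": "pCR2.1",
}


def deposited_plasmid(library_vector: str) -> str:
    """Return the plasmid in which a clone from the given library vector is deposited."""
    try:
        return DEPOSITED_PLASMID[library_vector]
    except KeyError:
        raise KeyError(f"vector not listed in the table: {library_vector!r}") from None
```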
Vectors Lambda Zap (U.S. Patent Nos. 5,128,256 and 5,286,636), Uni-Zap 
XR (U.S. Patent Nos. 5,128,256 and 5,286,636), Zap Express (U.S. Patent Nos. 
5,128,256 and 5,286,636), pBluescript (pBS) (Short, J. M. et al., Nucleic Acids Res. 
16:7583-7600 (1988); Alting-Mees, M. A. and Short, J. M., Nucleic Acids Res. 
17:9494 (1989)) and pBK (Alting-Mees, M. A. et al., Strategies 5:58-61 (1992)) are 
commercially available from Stratagene Cloning Systems, Inc., 11011 N. Torrey Pines 
Road, La Jolla, CA, 92037. pBS contains an ampicillin resistance gene and pBK 
contains a neomycin resistance gene. Both can be transformed into E. coli strain XL-1 
Blue, also available from Stratagene. pBS comes in 4 forms: SK+, SK-, KS+ and KS-. 
The S and K refer to the orientation of the polylinker to the T7 and T3 primer 
sequences which flank the polylinker region ("S" is for SacI and "K" is for KpnI, which 
are the first sites on each respective end of the linker). "+" or "-" refers to the orientation 
of the f1 origin of replication ("ori"), such that in one orientation, single stranded rescue 
initiated from the f1 ori generates sense strand DNA and in the other, antisense. 

Vectors pSport1, pCMVSport 2.0 and pCMVSport 3.0 were obtained from 
Life Technologies, Inc., P. O. Box 6009, Gaithersburg, MD 20807. All Sport vectors 
contain an ampicillin resistance gene and may be transformed into E. coli strain 
DH10B, also available from Life Technologies. (See, for instance, Gruber, C. E., et 
al., Focus 15:59 (1993).) Vector lafmid BA (Bento Soares, Columbia University, NY) 
contains an ampicillin resistance gene and can be transformed into E. coli strain XL-1 
Blue. Vector pCR®2.1, which is available from Invitrogen, 1600 Faraday Avenue, 
Carlsbad, CA 92008, contains an ampicillin resistance gene and may be transformed 
into E. coli strain DH10B, available from Life Technologies. (See, for instance, Clark, 
J. M., Nuc. Acids Res. 16:9677-9686 (1988) and Mead, D. et al., Biotechnology 9: 
(1991).) Preferably, a polynucleotide of the present invention does not comprise the 
phage vector sequences identified for the particular clone in Table 1, as well as the 
corresponding plasmid vector sequences designated above. 

The deposited material in the sample assigned the ATCC Deposit Number cited 
in Table 1 for any given cDNA clone also may contain one or more additional plasmids, 
each comprising a cDNA clone different from that given clone. Thus, deposits sharing 
the same ATCC Deposit Number contain at least a plasmid for each cDNA clone 
identified in Table 1. Typically, each ATCC deposit sample cited in Table 1 comprises 
a mixture of approximately equal amounts (by weight) of about 50 plasmid DNAs, each 
containing a different cDNA clone; but such a deposit sample may include plasmids for 
more or less than 50 cDNA clones, up to about 500 cDNA clones. 

Two approaches can be used to isolate a particular clone from the deposited 
sample of plasmid DNAs cited for that clone in Table 1. First, a plasmid is directly 
isolated by screening the clones using a polynucleotide probe corresponding to SEQ ID 
NO:X. 

Particularly, a specific polynucleotide with 30-40 nucleotides is synthesized 
using an Applied Biosystems DNA synthesizer according to the sequence reported. 
The plasmid mixture is transformed into a suitable host, as indicated above (such as 
XL-1 Blue (Stratagene)) using techniques known to those of skill in the art, such as 
those provided by the vector supplier or in related publications or patents cited above. 
The transformants are plated on 1.5% agar plates (containing the appropriate selection 
agent, e.g., ampicillin) to a density of about 150 transformants (colonies) per plate. 
These plates are screened using Nylon membranes according to routine methods for 
bacterial colony screening (e.g., Sambrook et al., Molecular Cloning: A Laboratory 
Manual, 2nd Edit., (1989), Cold Spring Harbor Laboratory Press, pages 1.93 to 
1.104), or other techniques known to those of skill in the art. 
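
As a rough guide to how many colonies need to be screened, the following sketch estimates the chance that a set of colonies includes the desired clone. It assumes, for illustration only, that every plasmid in the deposited mixture is equally represented and that each colony is an independent draw from the mixture; the function names are hypothetical.

```python
import math


def p_at_least_one(colonies: int, n_clones: int) -> float:
    """Probability that `colonies` colonies include the desired clone at least
    once, assuming `n_clones` equally represented plasmids in the mixture."""
    return 1.0 - (1.0 - 1.0 / n_clones) ** colonies


def colonies_needed(n_clones: int, confidence: float = 0.99) -> int:
    """Smallest number of colonies giving at least `confidence` probability
    of recovering the desired clone under the same assumptions."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - 1.0 / n_clones))
```

Under these assumptions, a single plate of about 150 colonies from a mixture of about 50 plasmids contains the desired clone with a probability of roughly 95%, while a mixture of about 500 plasmids would call for a few thousand colonies, i.e. several plates, for 99% confidence.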

Alternatively, two primers of 17-20 nucleotides derived from both ends of the 
SEQ ID NO:X (i.e., within the region of SEQ ID NO:X bounded by the 5' NT and the 
3' NT of the clone defined in Table 1) are synthesized and used to amplify the desired 
cDNA using the deposited cDNA plasmid as a template. The polymerase chain reaction 
is carried out under routine conditions, for instance, in 25 µl of reaction mixture with 
0.5 µg of the above cDNA template. A convenient reaction mixture is 1.5-5 mM 
MgCl2, 0.01% (w/v) gelatin, 20 µM each of dATP, dCTP, dGTP, dTTP, 25 pmol of 
each primer and 0.25 Unit of Taq polymerase. Thirty five cycles of PCR (denaturation 
at 94°C for 1 min; annealing at 55°C for 1 min; elongation at 72°C for 1 min) are 
performed with a Perkin-Elmer Cetus automated thermal cycler. The amplified product 
is analyzed by agarose gel electrophoresis and the DNA band with expected molecular 
weight is excised and purified. The PCR product is verified to be the selected sequence 
by subcloning and sequencing the DNA product. 
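
The primer selection described above can be sketched as follows, assuming the region of SEQ ID NO:X bounded by the 5' NT and 3' NT is available as a plain string of A, C, G and T. The forward primer is taken directly from the 5' end and the reverse primer is the reverse complement of the 3' end. The function names are illustrative, and the sketch makes no attempt to check melting temperature or secondary structure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def end_primers(region: str, length: int = 20) -> tuple:
    """Return (forward, reverse) primers of `length` nucleotides taken from
    the two ends of `region`, both written 5' to 3'."""
    if not 17 <= length <= 20:
        raise ValueError("primer length is expected to be 17-20 nucleotides")
    if len(region) < 2 * length:
        raise ValueError("region is too short for two non-overlapping primers")
    region = region.upper()
    forward = region[:length]
    reverse = reverse_complement(region[-length:])
    return forward, reverse
```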

Several methods are available for the identification of the 5' or 3' non-coding 
portions of a gene which may not be present in the deposited clone. These methods 
include but are not limited to, filter probing, clone enrichment using specific probes, 
and protocols similar or identical to 5' and 3' "RACE" protocols which are well known 
in the art. For instance, a method similar to 5' RACE is available for generating the 
missing 5' end of a desired full-length transcript. (Fromont-Racine et al., Nucleic Acids 
Res. 21(7):1683-1684 (1993).) 

Briefly, a specific RNA oligonucleotide is ligated to the 5' ends of a population 
of RNA presumably containing full-length gene RNA transcripts. A primer set 
containing a primer specific to the ligated RNA oligonucleotide and a primer specific to 
a known sequence of the gene of interest is used to PCR amplify the 5' portion of the 
desired full-length gene. This amplified product may then be sequenced and used to 
generate the full length gene. 

The above method starts with total RNA isolated from the desired source, 
although poly-A+ RNA can be used. The RNA preparation can then be treated with 
phosphatase if necessary to eliminate 5' phosphate groups on degraded or damaged 
RNA which may interfere with the later RNA ligase step. The phosphatase should then 
be inactivated and the RNA treated with tobacco acid pyrophosphatase in order to 
remove the cap structure present at the 5' ends of messenger RNAs. This reaction 
leaves a 5' phosphate group at the 5' end of the cap cleaved RNA which can then be 
ligated to an RNA oligonucleotide using T4 RNA ligase. 

This modified RNA preparation is used as a template for first strand cDNA 
synthesis using a gene specific oligonucleotide. The first strand synthesis reaction is 
used as a template for PCR amplification of the desired 5' end using a primer specific to 
the ligated RNA oligonucleotide and a primer specific to the known sequence of the 
gene of interest. The resultant product is then sequenced and analyzed to confirm that 
the 5' end sequence belongs to the desired gene. 

Example 2: Isolation of Genomic Clones Corresponding to a Polynucleotide 

A human genomic P1 library (Genomic Systems, Inc.) is screened by PCR 
using primers selected for the cDNA sequence corresponding to SEQ ID NO:X, 
according to the method described in Example 1. (See also, Sambrook.) 

Example 3: Tissue Distribution of Polypeptide 

Tissue distribution of mRNA expression of polynucleotides of the present 
invention is determined using protocols for Northern blot analysis, described by, 
among others, Sambrook et al. For example, a cDNA probe produced by the method 
described in Example 1 is labeled with P32 using the rediprime™ DNA labeling system 
(Amersham Life Science), according to manufacturer's instructions. After labeling, the 
probe is purified using CHROMA SPIN-100™ column (Clontech Laboratories, Inc.), 
according to manufacturer's protocol number PT1200-1. The purified labeled probe is 
then used to examine various human tissues for mRNA expression. 

Multiple Tissue Northern (MTN) blots containing various human tissues (H) or 
human immune system tissues (IM) (Clontech) are examined with the labeled probe 
using ExpressHyb™ hybridization solution (Clontech) according to manufacturer's 

protocol number PT1190-1. 

Example 4: Chromosomal Mapping of the Polynucleotides 

An oligonucleotide primer set is designed according to the sequence at the 5' 
end of SEQ ID NO:X. This primer set preferably spans about 100 nucleotides. This 
primer set is then used in a polymerase chain reaction under the following set of 
conditions: 30 seconds, 95°C; 1 minute, 56°C; 1 minute, 70°C. This cycle is repeated 
32 times followed by one 5 minute cycle at 70°C. Human, mouse, and hamster DNA 
is used as template in addition to a somatic cell hybrid panel containing individual 
chromosomes or chromosome fragments (Bios, Inc). The reactions are analyzed on 
either 8% polyacrylamide gels or 3.5% agarose gels. Chromosome mapping is 
determined by the presence of an approximately 100 bp PCR fragment in the particular 
somatic cell hybrid. 
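
The assignment step can be illustrated with a short sketch. It assumes the panel is described as a mapping from each hybrid cell line to the set of human chromosomes it retains, and that the PCR result is the set of hybrids yielding the approximately 100 bp fragment. A chromosome is reported only if its presence agrees with the PCR result in every hybrid. The data layout and names are hypothetical.

```python
from typing import Dict, List, Set


def assign_chromosome(panel: Dict[str, Set[str]], positives: Set[str]) -> List[str]:
    """Return the chromosomes whose presence across the hybrid panel matches
    the PCR result exactly (present in every positive hybrid, absent from
    every negative hybrid)."""
    candidates = set().union(*panel.values())
    return sorted(
        chrom for chrom in candidates
        if all((chrom in retained) == (hybrid in positives)
               for hybrid, retained in panel.items())
    )
```

For instance, with three hybrids retaining chromosomes {1, 7}, {7, X} and {1, X} respectively, a PCR product seen only in the first two hybrids points to chromosome 7.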

Example 5: Bacterial Expression of a Polypeptide 

A polynucleotide encoding a polypeptide of the present invention is amplified 
using PCR oligonucleotide primers corresponding to the 5' and 3' ends of the DNA 
sequence, as outlined in Example 1, to synthesize insertion fragments. The primers 
used to amplify the cDNA insert should preferably contain restriction sites, such as 
BamHI and XbaI, at the 5' end of the primers in order to clone the amplified product 
into the expression vector. For example, BamHI and XbaI correspond to the restriction 
enzyme sites on the bacterial expression vector pQE-9 (Qiagen, Inc., Chatsworth, 
CA). This plasmid vector encodes antibiotic resistance (Ampr), a bacterial origin of 
replication (ori), an IPTG-regulatable promoter/operator (P/O), a ribosome binding site 
(RBS), a 6-histidine tag (6-His), and restriction enzyme cloning sites. 
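
Adding a restriction site to the 5' end of a gene-specific primer can be sketched as below. The recognition sequences for BamHI (GGATCC) and XbaI (TCTAGA) are standard; the short 5' "clamp" of extra bases, often added so that the enzyme cuts efficiently near the end of a PCR product, is an assumption of this sketch and not a requirement stated in the text. The sketch does not check that the reading frame is maintained.

```python
# Standard recognition sequences for the two enzymes named in the text.
SITES = {"BamHI": "GGATCC", "XbaI": "TCTAGA"}


def add_site(primer: str, enzyme: str, clamp: str = "GCGC") -> str:
    """Prefix a gene-specific primer (5' to 3') with a clamp and a restriction site.

    `clamp` is an illustrative choice of extra 5' bases."""
    return clamp + SITES[enzyme] + primer.upper()
```

For a hypothetical gene-specific sequence "ATGGCC", add_site("ATGGCC", "BamHI") gives "GCGCGGATCCATGGCC".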

The vector is digested with BamHI and XbaI and the amplified fragment 
is ligated into the pQE-9 vector maintaining the reading frame initiated at the bacterial 
RBS. The ligation mixture is then used to transform the E. coli strain M15/rep4 
(Qiagen, Inc.) which contains multiple copies of the plasmid pREP4, which expresses 
the lacI repressor and also confers kanamycin resistance (Kanr). Transformants are 
identified by their ability to grow on LB plates and ampicillin/kanamycin resistant 
colonies are selected. Plasmid DNA is isolated and confirmed by restriction analysis. 

Clones containing the desired constructs are grown overnight (O/N) in liquid 
culture in LB media supplemented with both Amp (100 µg/ml) and Kan (25 µg/ml). 
The O/N culture is used to inoculate a large culture at a ratio of 1:100 to 1:250. The 
cells are grown to an optical density 600 (O.D.600) of between 0.4 and 0.6. IPTG 
(Isopropyl-β-D-thiogalactopyranoside) is then added to a final concentration of 1 mM. 
IPTG induces by inactivating the lacI repressor, clearing the P/O leading to increased 
gene expression. 
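
The volumes implied by the inoculation ratio and the IPTG concentration follow from simple proportions. The sketch below treats the ratio as inoculum volume to culture volume and assumes a 1 M IPTG stock; both are illustrative assumptions, as are the function names.

```python
def inoculum_ml(culture_ml: float, ratio: float) -> float:
    """Volume of overnight culture for a 1:`ratio` inoculation of `culture_ml`."""
    return culture_ml / ratio


def stock_volume_ml(final_mM: float, culture_ml: float, stock_mM: float) -> float:
    """Volume of stock giving `final_mM` in `culture_ml` (C1*V1 = C2*V2),
    neglecting the small volume added."""
    return final_mM * culture_ml / stock_mM
```

For a 1 liter culture this gives 4 to 10 ml of overnight culture (1:250 to 1:100) and 1 ml of a 1 M IPTG stock for a final concentration of 1 mM.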

Cells are grown for an extra 3 to 4 hours. Cells are then harvested by 
centrifugation (20 mins at 6000 xg). The cell pellet is solubilized in the chaotropic 
agent 6 Molar Guanidine HCl by stirring for 3-4 hours at 4°C. The cell debris is 
removed by centrifugation, and the supernatant containing the polypeptide is loaded 
onto a nickel-nitrilo-tri-acetic acid ("Ni-NTA") affinity resin column (available from 
QIAGEN, Inc., supra). Proteins with a 6 x His tag bind to the Ni-NTA resin with high 
affinity and can be purified in a simple one-step procedure (for details see: The 
QIAexpressionist (1995) QIAGEN, Inc., supra). 

Briefly, the supernatant is loaded onto the column in 6 M guanidine-HCl, pH 8; 
the column is first washed with 10 volumes of 6 M guanidine-HCl, pH 8, then washed 
with 10 volumes of 6 M guanidine-HCl, pH 6, and finally the polypeptide is eluted with 
6 M guanidine-HCl, pH 5. 

The purified protein is then renatured by dialyzing it against phosphate-buffered 
saline (PBS) or 50 mM Na-acetate, pH 6 buffer plus 200 mM NaCl. Alternatively, the 
protein can be successfully refolded while immobilized on the Ni-NTA column. The 
recommended conditions are as follows: renature using a linear 6M-1M urea gradient in 
500 mM NaCl, 20% glycerol, 20 mM Tris/HCl pH 7.4, containing protease inhibitors. 
The renaturation should be performed over a period of 1.5 hours or more. After 
renaturation the proteins are eluted by the addition of 250 mM imidazole. Imidazole 
is removed by a final dialyzing step against PBS or 50 mM sodium acetate pH 6 buffer 
plus 200 mM NaCl. The purified protein is stored at 4°C or frozen at -80°C. 
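
The linear urea gradient used for on-column refolding can be expressed as a simple interpolation. The sketch below assumes the minimum duration of 1.5 hours (90 minutes) stated above; a longer gradient would be modelled by passing a larger total time. The function name is illustrative.

```python
def urea_molar(t_min: float, total_min: float = 90.0,
               start: float = 6.0, end: float = 1.0) -> float:
    """Urea concentration (M) at time `t_min` of a linear gradient running
    from `start` to `end` over `total_min` minutes."""
    if not 0.0 <= t_min <= total_min:
        raise ValueError("time must lie within the gradient")
    return start + (end - start) * t_min / total_min
```

Halfway through a 90 minute gradient, for example, the column is at 3.5 M urea.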

In addition to the above expression vector, the present invention further includes 
an expression vector comprising phage operator and promoter elements operatively 
linked to a polynucleotide of the present invention, called pHE4a. (ATCC Accession 
Number 209645, deposited on February 25, 1998.) This vector contains: 1) a 
neomycinphosphotransferase gene as a selection marker, 2) an E. coli origin of 
replication, 3) a T5 phage promoter sequence, 4) two lac operator sequences, 5) a 
Shine-Dalgarno sequence, and 6) the lactose operon repressor gene (lacIq). The origin 
of replication (oriC) is derived from pUC19 (LTI, Gaithersburg, MD). The promoter 
sequence and operator sequences are made synthetically. 

insert is generated according to the PCR protocol described in Example 1, using PCR 
primers having restriction sites for NdeI (5' primer) and XbaI, BamHI, XhoI, or 
Asp718 (3' primer). The PCR insert is gel purified and restricted with compatible 
enzymes. The insert and vector are ligated according to standard protocols. 

The engineered vector could easily be substituted in the above protocol to 
express protein in a bacterial system. 

Example 6: Purification of a Polypeptide from an Inclusion Body 

The following alternative method can be used to purify a polypeptide expressed 
in E. coli when it is present in the form of inclusion bodies. Unless otherwise specified, 
all of the following steps are conducted at 4-10°C. 

Upon completion of the production phase of the E. coli fermentation, the cell 
culture is cooled to 4-10°C and the cells harvested by continuous centrifugation at 
15,000 rpm (Heraeus Sepatech). On the basis of the expected yield of protein per unit 
weight of cell paste and the amount of purified protein required, an appropriate amount 
of cell paste, by weight, is suspended in a buffer solution containing 100 mM Tris, 50 
mM EDTA, pH 7.4. The cells are dispersed to a homogeneous suspension using a 
high shear mixer. 

The cells are then lysed by passing the solution through a microfluidizer 
(Microfluidics Corp. or APV Gaulin, Inc.) twice at 4000-6000 psi. The homogenate is 
then mixed with NaCl solution to a final concentration of 0.5 M NaCl, followed by 
centrifugation at 7000 xg for 15 min. The resultant pellet is washed again using 0.5 M 
NaCl, 100 mM Tris, 50 mM EDTA, pH 7.4. 

The resulting washed inclusion bodies are solubilized with 1.5 M guanidine 
hydrochloride (GuHCl) for 2-4 hours. After 7000 xg centrifugation for 15 min., the 
pellet is discarded and the polypeptide containing supernatant is incubated at 4°C 
overnight to allow further GuHCl extraction. 

Following high speed centrifugation (30,000 xg) to remove insoluble particles, 
the GuHCl solubilized protein is refolded by quickly mixing the GuHCl extract with 20 
volumes of buffer containing 50 mM sodium, pH 4.5, 150 mM NaCl, 2 mM EDTA by 
vigorous stirring. The refolded diluted protein solution is kept at 4°C without mixing 
for 12 hours prior to further purification steps. 
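
The effect of the rapid dilution on the denaturant concentration can be estimated as follows. The sketch assumes that the extract is at the 1.5 M GuHCl used for solubilization and that "20 volumes" means 20 parts of buffer added to 1 part of extract, i.e. a 21-fold dilution; the function name is illustrative.

```python
def diluted_conc(conc: float, volumes_added: float) -> float:
    """Concentration after adding `volumes_added` volumes of diluent
    (containing none of the solute) to one volume of sample."""
    return conc / (1.0 + volumes_added)
```

Under these assumptions the GuHCl concentration falls from 1.5 M to about 0.07 M on dilution.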

To clarify the refolded polypeptide solution, a previously prepared tangential 
filtration unit equipped with 0.16 µm membrane filter with appropriate surface area 
(e.g., Filtron), equilibrated with 40 mM sodium acetate, pH 6.0 is employed. The 
filtered sample is loaded onto a cation exchange resin (e.g., Poros HS-50, Perseptive 
Biosystems). The column is washed with 40 mM sodium acetate, pH 6.0 and eluted 
with 250 mM, 500 mM, 1000 mM, and 1500 mM NaCl in the same buffer, in a 
stepwise manner. The absorbance at 280 nm of the effluent is continuously monitored. 
Fractions are collected and further analyzed by SDS-PAGE. 

Fractions containing the polypeptide are then pooled and mixed with 4 volumes 
of water. The diluted sample is then loaded onto a previously prepared set of tandem 
columns of strong anion (Poros HQ-50, Perseptive Biosystems) and weak anion 
(Poros CM-20, Perseptive Biosystems) exchange resins. The columns are equilibrated 
with 40 mM sodium acetate, pH 6.0. Both columns are washed with 40 mM sodium 
acetate, pH 6.0, 200 mM NaCl. The CM-20 column is then eluted using a 10 column 
volume linear gradient ranging from 0.2 M NaCl, 50 mM sodium acetate, pH 6.0 to 1.0 
M NaCl, 50 mM sodium acetate, pH 6.5. Fractions are collected under constant A280 
monitoring of the effluent. Fractions containing the polypeptide (determined, for 
instance, by 16% SDS-PAGE) are then pooled. 

The resultant polypeptide should exhibit greater than 95% purity after the above 
refolding and purification steps. No major contaminant bands should be observed from 
Coomassie blue stained 16% SDS-PAGE gel when 5 ng of purified protein is loaded. 
The purified protein can also be tested for endotoxin/LPS contamination, and typically 
the LPS content is less than 0.1 ng/ml according to LAL assays. 

Example 7: Cloning and Expression of a Polypeptide in a Baculovirus Expression System 

In this example, the plasmid shuttle vector pA2 is used to insert a polynucleotide 
into a baculovirus to express a polypeptide. This expression vector contains the strong 
polyhedrin promoter of the Autographa californica nuclear polyhedrosis virus 
(AcMNPV) followed by convenient restriction sites such as BamHI, Xba I and 
Asp718. The polyadenylation site of the simian virus 40 ("SV40") is used for efficient 
polyadenylation. For easy selection of recombinant virus, the plasmid contains the 
beta-galactosidase gene from E. coli under control of a weak Drosophila promoter in the 
same orientation, followed by the polyadenylation signal of the polyhedrin gene. The 

Many other baculovirus vectors can be used in place of the vector above, such 
as pAc373, pVL941 and pAcIM1, as one skilled in the art would readily appreciate, as 
long as the construct provides appropriately located signals for transcription, 
translation, secretion and the like, including a signal peptide and an in-frame AUG as 
required. Such vectors are described, for instance, in Luckow et al., Virology 
170:31-39 (1989). 

Specifically, the cDNA sequence contained in the deposited clone, including the 
AUG initiation codon and the naturally associated leader sequence identified in Table 1, 
is amplified using the PCR protocol described in Example 1. If the naturally occurring 
signal sequence is used to produce the secreted protein, the pA2 vector does not need a 
second signal peptide. Alternatively, the vector can be modified (pA2 GP) to include a 
baculovirus leader sequence, using the standard methods described in Summers et al., 
"A Manual of Methods for Baculovirus Vectors and Insect Cell Culture Procedures," 
Texas Agricultural Experimental Station Bulletin No. 1555 (1987). 

The amplified fragment is isolated from a 1% agarose gel using a commercially 
available kit ("Geneclean," BIO 101 Inc., La Jolla, Ca.). The fragment then is digested 
with appropriate restriction enzymes and again purified on a 1% agarose gel. 

The plasmid is digested with the corresponding restriction enzymes and 
optionally, can be dephosphorylated using calf intestinal phosphatase, using routine 
procedures known in the art. The DNA is then isolated from a 1% agarose gel using a 
commercially available kit ("Geneclean" BIO 101 Inc., La Jolla, Ca.). 

The fragment and the dephosphorylated plasmid are ligated together with T4 
DNA ligase. E. coli HB101 or other suitable E. coli hosts such as XL-1 Blue 
(Stratagene Cloning Systems, La Jolla, CA) cells are transformed with the ligation 
mixture and spread on culture plates. Bacteria containing the plasmid are identified by 
digesting DNA from individual colonies and analyzing the digestion product by gel 
electrophoresis. The sequence of the cloned fragment is confirmed by DNA 
sequencing. 

Five µg of a plasmid containing the polynucleotide is co-transfected with 1.0 µg 
of a commercially available linearized baculovirus DNA ("BaculoGold™ baculovirus 
DNA", Pharmingen, San Diego, CA), using the lipofection method described by 
Felgner et al., Proc. Natl. Acad. Sci. USA 84:7413-7417 (1987). One µg of 
BaculoGold™ virus DNA and 5 µg of the plasmid are mixed in a sterile well of a 
microtiter plate containing 50 µl of serum-free Grace's medium (Life Technologies 
Inc., Gaithersburg, MD). Afterwards, 10 µl Lipofectin plus 90 µl Grace's medium are 
added, mixed and incubated for 15 minutes at room temperature. Then the transfection 
mixture is added drop-wise to Sf9 insect cells (ATCC CRL 1711) seeded in a 35 mm 
tissue culture plate with 1 ml Grace's medium without serum. The plate is then 
incubated for 5 hours at 27°C. The transfection solution is then removed from the plate 
and 1 ml of Grace's insect medium supplemented with 10% fetal calf serum is added. 
Cultivation is then continued at 27°C for four days. 

After four days the supernatant is collected and a plaque assay is performed, as 
described by Summers and Smith, supra. An agarose gel with "Blue Gal" (Life 
Technologies Inc., Gaithersburg) is used to allow easy identification and isolation of 
gal-expressing clones, which produce blue-stained plaques. (A detailed description of a 
"plaque assay" of this type can also be found in the user's guide for insect cell culture 
and baculovirology distributed by Life Technologies Inc., Gaithersburg, page 9-10.)
After appropriate incubation, blue stained plaques are picked with the tip of a 
micropipettor (e.g., Eppendorf). The agar containing the recombinant viruses is then 
re-suspended in a microcentrifuge tube containing 200 ul of Grace's medium and the 
suspension containing the recombinant baculovirus is used to infect Sf9 cells seeded in 
35 mm dishes. Four days later the supernatants of these culture dishes are harvested 
and then they are stored at 4° C. 

To verify the expression of the polypeptide, Sf9 cells are grown in Grace's
medium supplemented with 10% heat-inactivated FBS. The cells are infected with the
recombinant baculovirus containing the polynucleotide at a multiplicity of infection
("MOI") of about 2. If radiolabeled proteins are desired, 6 hours later the medium is
removed and is replaced with SF900 II medium minus methionine and cysteine
(available from Life Technologies Inc., Rockville, MD). After 42 hours, 5 uCi of
35S-methionine and 5 uCi 35S-cysteine (available from Amersham) are added. The cells are
further incubated for 16 hours and then are harvested by centrifugation. The proteins
in the supernatant as well as the intracellular proteins are analyzed by SDS-PAGE
followed by autoradiography (if radiolabeled).
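
The infection step above specifies only the target MOI of about 2. As a minimal sketch of how that figure translates into a pipetting volume, the following assumes a virus stock titer and a cell count that are not given in this protocol; the function name and the example numbers are hypothetical and serve only to illustrate the arithmetic (volume = MOI x cells / titer).

```python
def virus_volume_ml(moi: float, n_cells: float, titer_pfu_per_ml: float) -> float:
    """Return the volume of virus stock (ml) needed to reach a given MOI.

    MOI is plaque-forming units (pfu) per cell, so the pfu required is
    moi * n_cells, and dividing by the stock titer gives the volume.
    """
    if titer_pfu_per_ml <= 0:
        raise ValueError("titer must be positive")
    if moi < 0 or n_cells < 0:
        raise ValueError("MOI and cell number must be non-negative")
    return moi * n_cells / titer_pfu_per_ml


if __name__ == "__main__":
    # Assumed values for illustration only: 1e6 Sf9 cells per dish and a
    # stock titer of 1e8 pfu/ml (neither number comes from the protocol).
    volume = virus_volume_ml(moi=2, n_cells=1e6, titer_pfu_per_ml=1e8)
    print(f"Add {volume * 1000:.1f} ul of virus stock per dish")
```

In practice the titer would come from a plaque assay such as the one described above, so the volume should be recomputed for each virus stock.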

Microsequencing of the amino acid sequence of the amino terminus of purified 
protein may be used to determine the amino terminal sequence of the produced 
protein. 



A typical mammalian expression vector contains a promoter element, which mediates
the initiation of transcription of mRNA, a protein coding sequence, and signals required
for the termination of transcription and polyadenylation of the transcript. Additional
elements include enhancers, Kozak sequences and intervening sequences flanked by
donor and acceptor sites for RNA splicing. Highly efficient transcription is achieved
with the early and late promoters from SV40, the long terminal repeats (LTRs) from
Retroviruses, e.g., RSV, HTLVI, HIVI and the early promoter of the cytomegalovirus
(CMV). However, cellular elements can also be used (e.g., the human actin promoter).

Suitable expression vectors for use in practicing the present invention include,
for example, vectors such as pSVL and pMSG (Pharmacia, Uppsala, Sweden),
pRSVcat (ATCC 37152), pSV2dhfr (ATCC 37146), pBC12MI (ATCC 67109),
pCMVSport 2.0, and pCMVSport 3.0. Mammalian host cells that could be used
include human Hela, 293, H9 and Jurkat cells, mouse NIH3T3 and C127 cells, Cos 1,
Cos 7 and CV 1, quail QC1-3 cells, mouse L cells and Chinese hamster ovary (CHO)
cells.

Alternatively, the polypeptide can be expressed in stable cell lines containing the
polynucleotide integrated into a chromosome. Co-transfection with a selectable
marker such as dhfr, gpt, neomycin, or hygromycin allows the identification and isolation
of the transfected cells.

The transfected gene can also be amplified to express large amounts of the
encoded protein. The DHFR (dihydrofolate reductase) marker is useful in developing
cell lines that carry several hundred or even several thousand copies of the gene of
interest. (See, e.g., Alt, F. W., et al., J. Biol. Chem. 253:1357-1370 (1978); Hamlin,
J. L. and Ma, C., Biochem. et Biophys. Acta, 1097:107-143 (1990); Page, M. J. and
Sydenham, M. A., Biotechnology 9:64-68 (1991).) Another useful selection marker is
the enzyme glutamine synthase (GS) (Murphy et al., Biochem J. 227:277-279 (1991);
Bebbington et al., Bio/Technology 10:169-175 (1992)). Using these markers, the
mammalian cells are grown in selective medium and the cells with the highest resistance
are selected. These cell lines contain the amplified gene(s) integrated into a
chromosome. Chinese hamster ovary (CHO) and NSO cells are often used for the
production of proteins.

Derivatives of the plasmid pSV2-dhfr (ATCC Accession No. 37146), the
expression vectors pC4 (ATCC Accession No. 209646) and pC6 (ATCC Accession
No. 209647) contain the strong promoter (LTR) of the Rous Sarcoma Virus (Cullen et
al., Molecular and Cellular Biology, 438-447 (March, 1985)) plus a fragment of the
CMV-enhancer (Boshart et al., Cell 41:521-530 (1985)). Multiple cloning sites, e.g.,
with the restriction enzyme cleavage sites BamHI, XbaI and Asp718, facilitate the
cloning of the gene of interest. The vectors also contain the 3' intron, the
polyadenylation and termination signal of the rat preproinsulin gene, and the mouse
DHFR gene under control of the SV40 early promoter.

Specifically, the plasmid pC6, for example, is digested with appropriate 
restriction enzymes and then dephosphorylated using calf intestinal phosphatase by
procedures known in the art. The vector is then isolated from a 1% agarose gel. 

A polynucleotide of the present invention is amplified according to the protocol 
outlined in Example 1. If the naturally occurring signal sequence is used to produce the
secreted protein, the vector does not need a second signal peptide. Alternatively, if the
naturally occurring signal sequence is not used, the vector can be modified to include a
heterologous signal sequence. (See, e.g., WO 96/34891.) 

The amplified fragment is isolated from a 1% agarose gel using a commercially 
available kit ("Geneclean," BIO 101 Inc., La Jolla, Ca.). The fragment then is digested
with appropriate restriction enzymes and again purified on a 1% agarose gel. 

The amplified fragment is then digested with the same restriction enzyme and 
purified on a 1% agarose gel. The isolated fragment and the dephosphorylated vector 
are then ligated with T4 DNA ligase. E. coli HB101 or XL-1 Blue cells are then
transformed and bacteria are identified that contain the fragment inserted into plasmid 
pC6 using, for instance, restriction enzyme analysis. 

Chinese hamster ovary cells lacking an active DHFR gene are used for
transfection. Five ug of the expression plasmid pC6 is cotransfected with 0.5 ug of the
plasmid pSVneo using lipofectin (Felgner et al., supra). The plasmid pSV2-neo
contains a dominant selectable marker, the neo gene from Tn5 encoding an enzyme that
confers resistance to a group of antibiotics including G418. The cells are seeded in
alpha minus MEM supplemented with 1 mg/ml G418. After 2 days, the cells are
trypsinized and seeded in hybridoma cloning plates (Greiner, Germany) in alpha minus
MEM supplemented with 10, 25, or 50 ng/ml of methotrexate plus 1 mg/ml G418.
After about 10-14 days single clones are trypsinized and then seeded in 6-well petri
dishes or 10 ml flasks using different concentrations of methotrexate (50 nM, 100 nM,
200 nM, 400 nM, 800 nM). Clones growing at the highest concentrations of
methotrexate are then transferred to new 6-well plates containing even higher
concentrations of methotrexate (1 uM, 2 uM, 5 uM, 10 mM, 20 mM). The same
procedure is repeated until clones are obtained which grow at a concentration of 100 - 
200 nM. Expression of the desired gene product is analyzed, for instance, by SDS-

Example 9: Protein Fusions

The polypeptides of the present invention are preferably fused to other proteins.
These fusion proteins can be used for a variety of applications. For example, fusion of
the present polypeptides to His-tag, HA-tag, protein A, IgG domains, and maltose
binding protein facilitates purification. (See Example 5; see also EP A 394,827;
Traunecker, et al., Nature 331:84-86 (1988).) Similarly, fusion to IgG-1, IgG-3, and
albumin increases the half-life in vivo. Nuclear localization signals fused to the
polypeptides of the present invention can target the protein to a specific subcellular
localization, while covalent heterodimers or homodimers can increase or decrease the
activity of a fusion protein. Fusion proteins can also create chimeric molecules having
more than one function. Finally, fusion proteins can increase solubility and/or stability
of the fused protein compared to the non-fused protein. All of the types of fusion
proteins described above can be made by modifying the following protocol, which
outlines the fusion of a polypeptide to an IgG molecule, or the protocol described in
Example 5.

Briefly, the human Fc portion of the IgG molecule can be PCR amplified, using
primers that span the 5' and 3' ends of the sequence described below. These primers 
also should have convenient restriction enzyme sites that will facilitate cloning into an 
expression vector, preferably a mammalian expression vector. 

For example, if pC4 (Accession No. 209646) is used, the human Fc portion can
be ligated into the BamHI cloning site. Note that the 3' BamHI site should be
destroyed. Next, the vector containing the human Fc portion is re-restricted with
BamHI, linearizing the vector, and a polynucleotide of the present invention, isolated
by the PCR protocol described in Example 1, is ligated into this BamHI site. Note that
the polynucleotide is cloned without a stop codon, otherwise a fusion protein will not
be produced.
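
The requirement that the polynucleotide be cloned without a stop codon can be checked before ligation. The sketch below is illustrative only: it assumes the insert is supplied as a plain coding sequence whose first base is the first base of a codon, and the function names are hypothetical. It does not check the reading frame across the vector junction, which depends on the primers actually used.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_in_frame_stop(coding_seq: str):
    """Return the 0-based index of the first in-frame stop codon, or None."""
    seq = "".join(coding_seq.split()).upper()
    # Walk the sequence codon by codon, starting at position 0.
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def can_fuse_upstream(coding_seq: str) -> bool:
    """True if the insert can read through into a downstream fusion partner.

    The insert must be a whole number of codons and must not contain an
    in-frame stop codon, otherwise translation ends before the Fc portion.
    """
    seq = "".join(coding_seq.split()).upper()
    return len(seq) % 3 == 0 and first_in_frame_stop(seq) is None


if __name__ == "__main__":
    print(can_fuse_upstream("ATGGCCAAA"))  # no stop codon in frame
    print(can_fuse_upstream("ATGGCCTGA"))  # ends with the stop codon TGA
```

Out-of-frame occurrences such as the TGA inside ATGATGACC are ignored on purpose, since only in-frame stops terminate translation.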

If the naturally occurring signal sequence is used to produce the secreted 
protein, pC4 does not need a second signal peptide. Alternatively, if the naturally
occurring signal sequence is not used, the vector can be modified to include a
heterologous signal sequence. (See, e.g., WO 96/34891.)

Human IgG Fc region: 

GGGATCCGGAGCCCAAATCTTCTGACAAAACTCACACATGGCCACCGTGCC
CAGCACCTGAATTCGAGGGTGCAGCGTCAGTCTTCCTCTTCCCCCCAAAACC
CAAGGACACCCTCATGATCTCCCGGACTCCTGAGGTCACATGCGTGGTGGT
GGACGTAAGCCACGAAGACCCTGAGGTCAAGTTCAACTGGTACGTGGACG
GCGTGGAGGTGCATAATGCCAAGACAAAGCCGCGGGAGGAGCAGTACAAC
AGCACGTACCGTGTGGTCAGCGTCCTCACCGTCCTGCACCAGGACTGGCTG
AATGGCAAGGAGTACAAGTGCAAGGTCTCCAACAAAGCCCTCCCAACCCCC
ATCGAGAAAACCATCTCCAAAGCCAAAGGGCAGCCCCGAGAACCACAGGT
GTACACCCTGCCCCCATCCCGGGATGAGCTGACCAAGAACCAGGTCAGCCT
GACCTGCCTGGTCAAAGGCTTCTATCCAAGCGACATCGCCGTGGAGTGGGA
GAGCAATGGGCAGCCGGAGAACAACTACAAGACCACGCCTCCCGTGCTGG
ACTCCGACGGCTCCTTCTTCCTCTACAGCAAGCTCACCGTGGACAAGAGCA
GGTGGCAGCAGGGGAACGTCTTCTCATGCTCCGTGATGCATGAGGCTCTGC
ACAACCACTACACGCAGAAGAGCCTCTCCCTGTCTCCGGGTAAATGAGTGC
GACGGCCGCGACTCTAGAGGAT (SEQ ID NO:1)
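
The cloning strategy above relies on the positions of restriction sites in the Fc fragment. As a rough illustration, the sketch below scans short excerpts taken from the two ends of the sequence listed above for the BamHI (GGATCC) and XbaI (TCTAGA) recognition sequences. The helper name is hypothetical, and only the forward strand is searched, which is sufficient here because both recognition sequences are palindromic.

```python
def find_sites(seq: str, site: str) -> list:
    """Return all 0-based start positions of `site` in `seq` (forward strand)."""
    seq = "".join(seq.split()).upper()
    site = site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        # Continue one base further so overlapping sites are also found.
        start = seq.find(site, start + 1)
    return positions


# Excerpts copied from the 5' and 3' ends of the Fc sequence shown above.
FC_HEAD = "GGGATCCGGAGCCCAAATCTTCTGAC"
FC_TAIL = "GACGGCCGCGACTCTAGAGGAT"

if __name__ == "__main__":
    print("BamHI in 5' end:", find_sites(FC_HEAD, "GGATCC"))
    print("BamHI in 3' end:", find_sites(FC_TAIL, "GGATCC"))
    print("XbaI in 3' end:", find_sites(FC_TAIL, "TCTAGA"))
```

On these excerpts the 5' end carries a BamHI site while the 3' end does not, which is consistent with the note that the 3' BamHI site should be destroyed.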






Example 10: Production of an Antibody from a Polypeptide 

The antibodies of the present invention can be prepared by a variety of methods.
(See, Current Protocols, Chapter 2.) For example, cells expressing a polypeptide of
the present invention are administered to an animal to induce the production of sera
containing polyclonal antibodies. In a preferred method, a preparation of the secreted
protein is prepared and purified to render it substantially free of natural contaminants.
Such a preparation is then introduced into an animal in order to produce polyclonal
antisera of greater specific activity.

In the most preferred method, the antibodies of the present invention are
monoclonal antibodies (or protein binding fragments thereof). Such monoclonal
antibodies can be prepared using hybridoma technology. (Kohler et al., Nature
256:495 (1975); Kohler et al., Eur. J. Immunol. 6:511 (1976); Kohler et al., Eur. J.
Immunol. 6:292 (1976); Hammerling et al., in: Monoclonal Antibodies and T-Cell
Hybridomas, Elsevier, N.Y., pp. 563-681 (1981).) In general, such procedures
involve immunizing an animal (preferably a mouse) with polypeptide or, more
preferably, with a secreted polypeptide-expressing cell. Such cells may be cultured in
any suitable tissue culture medium; however, it is preferable to culture cells in Earle's
modified Eagle's medium supplemented with 10% fetal bovine serum (inactivated at
about 56°C), and supplemented with about 10 g/l of nonessential amino acids, about
1,000 U/ml of penicillin, and about 100 ug/ml of streptomycin.

The splenocytes of such mice are extracted and fused with a suitable myeloma 
cell line. Any suitable myeloma cell line may be employed in accordance with the
described by Wands et al. (Gastroenterology 80:225-232 (1981).) The hybridoma cells
obtained through such a selection are then assayed to identify clones which secrete
antibodies capable of binding the polypeptide.

Alternatively, additional antibodies capable of binding to the polypeptide can be
produced in a two-step procedure using anti-idiotypic antibodies. Such a method
makes use of the fact that antibodies are themselves antigens, and therefore, it is
possible to obtain an antibody which binds to a second antibody. In accordance with
this method, protein specific antibodies are used to immunize an animal, preferably a
mouse. The splenocytes of such an animal are then used to produce hybridoma cells,
and the hybridoma cells are screened to identify clones which produce an antibody
whose ability to bind to the protein-specific antibody can be blocked by the polypeptide.
Such antibodies comprise anti-idiotypic antibodies to the protein-specific antibody and
can be used to immunize an animal to induce formation of further protein-specific
antibodies.

It will be appreciated that Fab and F(ab')2 and other fragments of the antibodies
of the present invention may be used according to the methods disclosed herein. Such
fragments are typically produced by proteolytic cleavage, using enzymes such as papain
(to produce Fab fragments) or pepsin (to produce F(ab')2 fragments). Alternatively,
secreted protein-binding fragments can be produced through the application of
recombinant DNA technology or through synthetic chemistry.

For in vivo use of antibodies in humans, it may be preferable to use 
"humanized" chimeric monoclonal antibodies. Such antibodies can be produced using 
genetic constructs derived from hybridoma cells producing the monoclonal antibodies 
described above. Methods for producing chimeric antibodies are known in the art. 

(See, for review, Morrison, Science 229:1202 (1985); Oi et al., BioTechniques 4:214
(1986); Cabilly et al., U.S. Patent No. 4,816,567; Taniguchi et al., EP 171496;
Morrison et al., EP 173494; Neuberger et al., WO 8601533; Robinson et al., WO
8702671; Boulianne et al., Nature 312:643 (1984); Neuberger et al., Nature 314:268
(1985).)


Example 11: Production Of Secreted Protein For High-Throughput
Screening Assays

The following protocol produces a supernatant containing a polypeptide to be 
tested. This supernatant can then be used in the Screening Assays described in 
Examples 13-20.

First, dilute Poly-D-Lysine (644 587 Boehringer-Mannheim) stock solution
(1 mg/ml in PBS) 1:20 in PBS (w/o calcium or magnesium 17-516F Biowhittaker) for a
working solution of 50 ug/ml. Add 200 ul of this solution to each well (24 well plates)
and incubate at RT for 20 minutes. Be sure to distribute the solution over each well
(note: a 12-channel pipetter may be used with tips on every other channel). Aspirate off
the Poly-D-Lysine solution and rinse with 1 ml PBS (Phosphate Buffered Saline). The
PBS should remain in the well until just prior to plating the cells and plates may be
poly-lysine coated in advance for up to two weeks.
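
The dilution in the coating step can be worked through numerically. The following is a minimal sketch, assuming that "1:20" means one part stock in twenty parts final volume (which is what yields 50 ug/ml from a 1 mg/ml stock) and that no excess volume is prepared; the function name and the returned keys are hypothetical.

```python
def coating_solution(stock_ug_per_ml: float, dilution_factor: float,
                     wells: int, ul_per_well: float) -> dict:
    """Work out the volumes needed to coat one plate with diluted stock."""
    total_ul = wells * ul_per_well          # working solution required
    stock_ul = total_ul / dilution_factor   # stock to pipette
    return {
        "working_ug_per_ml": stock_ug_per_ml / dilution_factor,
        "total_ul": total_ul,
        "stock_ul": stock_ul,
        "pbs_ul": total_ul - stock_ul,      # diluent to make up the volume
    }


if __name__ == "__main__":
    # Values from the protocol: 1 mg/ml stock, 1:20 dilution,
    # 24-well plate, 200 ul per well.
    plan = coating_solution(1000.0, 20.0, wells=24, ul_per_well=200.0)
    for key, value in plan.items():
        print(f"{key}: {value:g}")
```

A real preparation would normally include some dead volume for the reservoir, which this sketch deliberately leaves out.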

Plate 293T cells (do not carry cells past P+20) at 2 x 10^5 cells/well in 0.5 ml
DMEM (Dulbecco's Modified Eagle Medium) (with 4.5 G/L glucose and L-glutamine
(12-604F Biowhittaker))/10% heat inactivated FBS (14-503F Biowhittaker)/1x
Penstrep (17-602E Biowhittaker). Let the cells grow overnight.

The next day, mix together in a sterile solution basin: 300 ul Lipofectamine
(18324-012 Gibco/BRL) and 5 ml Optimem I (31985070 Gibco/BRL)/96-well plate.
With a small volume multi-channel pipetter, aliquot approximately 2 ug of an expression
vector containing a polynucleotide insert, produced by the methods described in
Examples 8 or 9, into an appropriately labeled 96-well round bottom plate. With a
multi-channel pipetter, add 50 ul of the Lipofectamine/Optimem I mixture to each well.
Pipette up and down gently to mix. Incubate at RT 15-45 minutes. After about 20
minutes, use a multi-channel pipetter to add 150 ul Optimem I to each well. As a
control, one plate of vector DNA lacking an insert should be transfected with each set of
transfections.
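
The volumes in the transfection mix can be cross-checked with a short calculation. This sketch uses only the figures stated above (300 ul Lipofectamine plus 5 ml Optimem I per 96-well plate, 50 ul of mix per well, then 150 ul Optimem I per well) and neglects the small volume of the DNA aliquot, which is an assumption; the function name is hypothetical.

```python
def transfection_mix_plan(wells: int = 96,
                          lipofectamine_ul: float = 300.0,
                          optimem_ul: float = 5000.0,
                          mix_per_well_ul: float = 50.0,
                          optimem_topup_ul: float = 150.0) -> dict:
    """Summarise how far one batch of Lipofectamine/Optimem I mix goes."""
    mix_total = lipofectamine_ul + optimem_ul
    mix_needed = wells * mix_per_well_ul
    return {
        "mix_total_ul": mix_total,
        "mix_needed_ul": mix_needed,
        "mix_spare_ul": mix_total - mix_needed,
        # Volume of complex per well after the Optimem I top-up,
        # ignoring the volume of the DNA itself.
        "complex_per_well_ul": mix_per_well_ul + optimem_topup_ul,
    }


if __name__ == "__main__":
    for key, value in transfection_mix_plan().items():
        print(f"{key}: {value:g}")
```

The resulting 200 ul of complex per well matches the 200 ul that is later transferred onto the cells in the 24-well plates.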

Preferably, the transfection should be performed by tag-teaming the following
tasks. By tag-teaming, hands-on time is cut in half, and the cells do not spend too
much time on PBS. First, person A aspirates off the media from four 24-well plates of
cells, and then person B rinses each well with 0.5-1 ml PBS. Person A then aspirates off
the PBS rinse, and person B, using a 12-channel pipetter with tips on every other channel,
adds the 200 ul of DNA/Lipofectamine/Optimem I complex to the odd wells first, then to
the even wells, to each row on the 24-well plates. Incubate at 37°C for 6 hours.

While cells are incubating, prepare appropriate media, either 1% BSA in DMEM
with 1x penstrep, or CHO-5 media (116.6 mg/L of CaCl2 (anhyd); 0.00130 mg/L
CuSO4-5H2O; 0.050 mg/L of Fe(NO3)3-9H2O; 0.417 mg/L of FeSO4-7H2O; 311.80
mg/L of KCl; 28.64 mg/L of MgCl2; 48.84 mg/L of MgSO4; 6995.50 mg/L of NaCl;
2400.0 mg/L of NaHCO3; 62.50 mg/L of NaH2PO4-H2O; 71.02 mg/L of Na2HPO4;
.4320 mg/L of ZnSO4-7H2O; .002 mg/L of Arachidonic Acid; 1.022 mg/L of
Pluronic F-68; 0.010 mg/L of Stearic Acid; 2.20 mg/L of Tween 80; 4551 mg/L of
D-Glucose; 130.85 mg/ml of L-Alanine; 147.50 mg/ml of L-Arginine-HCL; 7.50 mg/ml
of L-Asparagine-H2O; 6.65 mg/ml of L-Aspartic Acid; 29.56 mg/ml of
L-Cystine-2HCL-H2O; 31.29 mg/ml of L-Cystine-2HCL; 7.35 mg/ml of L-Glutamic Acid; 365.0
mg/ml of L-Glutamine; 18.75 mg/ml of Glycine; 52.48 mg/ml of
L-Histidine-HCL-H2O; 106.97 mg/ml of L-Isoleucine; 111.45 mg/ml of L-Leucine; 163.75 mg/ml of
L-Lysine HCL; 32.34 mg/ml of L-Methionine; 68.48 mg/ml of L-Phenylalanine; 40.0
mg/ml of L-Proline; 26.25 mg/ml of L-Serine; 101.05 mg/ml of L-Threonine; 19.22
mg/ml of L-Tryptophan; 91.79 mg/ml of L-Tyrosine-2Na-2H2O; 99.65 mg/ml of
L-Valine; 0.0035 mg/L of Biotin; 3.24 mg/L of D-Ca Pantothenate; 11.78 mg/L of
Choline Chloride; 4.65 mg/L of Folic Acid; 15.60 mg/L of i-Inositol; 3.02 mg/L of
Niacinamide; 3.00 mg/L of Pyridoxal HCL; 0.031 mg/L of Pyridoxine HCL; 0.319
mg/L of Riboflavin; 3.17 mg/L of Thiamine HCL; 0.365 mg/L of Thymidine; and
0.680 mg/L of Vitamin B12; 25 mM of HEPES Buffer; 2.39 mg/L of Na Hypoxanthine;
0.105 mg/L of Lipoic Acid; 0.081 mg/L of Sodium Putrescine-2HCL; 55.0 mg/L of
Sodium Pyruvate; 0.0067 mg/L of Sodium Selenite; 20 uM of Ethanolamine; 0.122
mg/L of Ferric Citrate; 41.70 mg/L of Methyl-B-Cyclodextrin complexed with Linoleic
Acid; 33.33 mg/L of Methyl-B-Cyclodextrin complexed with Oleic Acid; and 10 mg/L
of Methyl-B-Cyclodextrin complexed with Retinal) with 2 mM glutamine and 1x
penstrep. (BSA (81-068-3 Bayer) 100 gm dissolved in 1 L DMEM for a 10% BSA stock
solution). Filter the media and collect 50 ul for endotoxin assay in 15 ml polystyrene
conical.

The transfection reaction is terminated, preferably by tag-teaming, at the end of 
the incubation period. Person A aspirates off the transfection media, while person B 
adds 1.5ml appropriate media to each well. Incubate at 37°C for 45 or 72 hours 
depending on the media used: 1% BSA for 45 hours or CHO-5 for 72 hours.

On day four, using a 300ul multichannel pipetter, aliquot 600ul in one 1ml deep 
well plate and the remaining supernatant into a 2ml deep well. The supernatants from 
each well can then be used in the assays described in Examples 13-20. 

It is specifically understood that when activity is obtained in any of the assays 
described below using a supernatant, the activity originates from either the polypeptide 
directly (e.g., as a secreted protein) or by the polypeptide inducing expression of other
proteins, which are then secreted into the supernatant. Thus, the invention further
provides a method of identifying the protein in the supernatant characterized by an
activity in a particular assay.

Example 12: Construction of GAS Reporter Construct

One signal transduction pathway involved in the differentiation and proliferation 
of cells is called the Jaks-STATs pathway. Activated proteins in the Jaks-STATs 
pathway bind to gamma activation site ("GAS") elements or interferon-sensitive
responsive element ("ISRE"), located in the promoter of many genes. The binding of a
protein to these elements alters the expression of the associated gene.

GAS and ISRE elements are recognized by a class of transcription factors called
Signal Transducers and Activators of Transcription, or "STATs." There are six
members of the STATs family. Stat1 and Stat3 are present in many cell types, as is
Stat2 (as response to IFN-alpha is widespread). Stat4 is more restricted and is not in
many cell types though it has been found in T helper class 1 cells after treatment with
IL-12. Stat5 was originally called mammary growth factor, but has been found at
higher concentrations in other cells including myeloid cells. It can be activated in tissue
culture cells by many cytokines.

The STATs are activated to translocate from the cytoplasm to the nucleus upon
tyrosine phosphorylation by a set of kinases known as the Janus Kinase ("Jaks")
family. Jaks represent a distinct family of soluble tyrosine kinases and include Tyk2,
Jak1, Jak2, and Jak3. These kinases display significant sequence similarity and are
generally catalytically inactive in resting cells.

The Jaks are activated by a wide range of receptors summarized in the Table
below. (Adapted from review by Schindler and Darnell, Ann. Rev. Biochem. 64:621-51
(1995).) A cytokine receptor family, capable of activating Jaks, is divided into two
groups: (a) Class 1 includes receptors for IL-2, IL-3, IL-4, IL-6, IL-7, IL-9, IL-11,
IL-12, IL-15, Epo, PRL, GH, G-CSF, GM-CSF, LIF, CNTF, and thrombopoietin; and
(b) Class 2 includes IFN-a, IFN-g, and IL-10. The Class 1 receptors share a
conserved cysteine motif (a set of four conserved cysteines and one tryptophan) and a
WSXWS motif (a membrane proximal region encoding Trp-Ser-Xxx-Trp-Ser (SEQ ID
NO:2)).

Thus, on binding of a ligand to a receptor, Jaks are activated, which in turn
activate STATs, which then translocate and bind to GAS elements. This entire process
is encompassed in the Jaks-STATs signal transduction pathway.

Therefore, activation of the Jaks-STATs pathway, reflected by the binding of 
the GAS or the ISRE element, can be used to indicate proteins involved in the

Ligand

IFN family
IFN-a/B
IFN-g
IL-10

gp130 family
IL-6 (Pleiotropic)
IL-11 (Pleiotropic)
OnM (Pleiotropic)
LIF (Pleiotropic)
CNTF (Pleiotropic)
G-CSF (Pleiotropic)
IL-12 (Pleiotropic)

g-C family
IL-2 (lymphocytes)
IL-4 (lymph/myeloid)
IL-7 (lymphocytes)
IL-9 (lymphocytes)
IL-13 (lymphocyte)
IL-15

gp140 family
IL-3 (myeloid)
IL-5 (myeloid)
GM-CSF (myeloid)

Growth hormone family
GH
PRL
EPO

Receptor Tyrosine Kinases
EGF
PDGF
CSF-1


Table columns: Ligand; JAKs (tyk2, Jak1, Jak2, Jak3); STATs; GAS (elements) or ISRE.
Entries in the last column include ISRE, GAS, GAS (IRF1>Lys6>IFP), GAS (IRF1 = IFP >>Ly6)(IgH), GAS (IRF1>IFP>>Ly6), GAS (B-CAS>IRF1=IFP>>Ly6), GAS (IRF1), and GAS (not IRF1).

To construct a synthetic GAS containing promoter element, which is used in the
Biological Assays described in Examples 13-14, a PCR based strategy is employed to
generate a GAS-SV40 promoter sequence. The 5' primer contains four tandem copies
of the GAS binding site found in the IRF1 promoter and previously demonstrated to
bind STATs upon induction with a range of cytokines (Rothman et al., Immunity
1:457-468 (1994).), although other GAS or ISRE elements can be used instead. The 5'
primer also contains 18 bp of sequence complementary to the SV40 early promoter
sequence and is flanked with an XhoI site. The sequence of the 5' primer is:
5':GCGCCTCGAGATT^
AAATGATTTCCCCGAAATATCTGCCATCTCAATTAG:3' (SEQ ID NO:3)

The downstream primer is complementary to the SV40 promoter and is flanked 
with a Hind III site: 5':GCGGCAAGCTTTTTGCAAAGCCTAGGC:3' (SEQ ID

PCR amplification is performed using the SV40 promoter template present in
the B-gal:promoter plasmid obtained from Clontech. The resulting PCR fragment is
digested with XhoI/Hind III and subcloned into BLSK2-. (Stratagene.) Sequencing
with forward and reverse primers confirms that the insert contains the following
sequence:

5':CTCGAGAT^
ATTTCCCCGAAATATCTGCCATCTCAATTAGTCAGCAACCATAGTCCCGCCC
CTAACTCCGCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGC
CCCATGGCTGACTAATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGC
CTCTGAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTT
TGCAAAAAGCTT:3' (SEQ ID NO:5)

With this GAS promoter element linked to the SV40 promoter, a GAS:SEAP2
reporter construct is next engineered. Here, the reporter molecule is a secreted alkaline
phosphatase, or "SEAP." Clearly, however, any reporter molecule can be used instead of
SEAP, in this or in any of the other Examples. Well known reporter molecules that can
be used instead of SEAP include chloramphenicol acetyltransferase (CAT), luciferase,
alkaline phosphatase, B-galactosidase, green fluorescent protein (GFP), or any protein
detectable by an antibody.

The above sequence confirmed synthetic GAS SV40 promoter element is 
subcloned into the pSEAP-Promoter vector obtained from Clontech using HindIII and

Thus, in order to generate mammalian stable cell lines expressing the
GAS-SEAP reporter, the GAS-SEAP cassette is removed from the GAS-SEAP vector using
SalI and NotI, and inserted into a backbone vector containing the neomycin resistance
gene, such as pGFP-1 (Clontech), using these restriction sites in the multiple cloning
site, to create the GAS-SEAP/Neo vector. Once this vector is transfected into
mammalian cells, this vector can then be used as a reporter molecule for GAS binding
as described in Examples 13-14.

Other constructs can be made using the above description and replacing GAS
with a different promoter sequence. For example, construction of reporter molecules
containing NF-KB and EGR promoter sequences is described in Examples 15 and 16.
However, many other promoters can be substituted using the protocols described in
these Examples. For instance, SRE, IL-2, NFAT, or Osteocalcin promoters can be
substituted, alone or in combination (e.g., GAS/NF-KB/EGR, GAS/NF-KB,
IL-2/NFAT, or NF-KB/GAS). Similarly, other cell lines can be used to test reporter
construct activity, such as HELA (epithelial), HUVEC (endothelial), Reh (B-cell),
Saos-2 (osteoblast), HUVAC (aortic), or Cardiomyocyte.

Example 13: High-Throughput Screening Assay for T-cell Activity. 

The following protocol is used to assess T-cell activity by identifying factors,
such as growth factors and cytokines, that may proliferate or differentiate T-cells.
T-cell activity is assessed using the GAS/SEAP/Neo construct produced in Example 12.
Thus, factors that increase SEAP activity indicate the ability to activate the Jaks-STATS
signal transduction pathway. The T-cells used in this assay are Jurkat T-cells (ATCC
Accession No. TIB-152), although Molt-3 cells (ATCC Accession No. CRL-1552) and
Molt-4 cells (ATCC Accession No. CRL-1582) can also be used.

Jurkat T-cells are lymphoblastic CD4+ Th1 helper cells. In order to generate
stable cell lines, approximately 2 million Jurkat cells are transfected with the
GAS-SEAP/neo vector using DMRIE-C (Life Technologies) (transfection procedure
described below). The transfected cells are seeded to a density of approximately
20,000 cells per well and transfectants resistant to 1 mg/ml genticin are selected. Resistant
colonies are expanded and then tested for their response to increasing concentrations of
interferon gamma. The dose response of a selected clone is demonstrated.

Specifically, the following protocol will yield sufficient cells for 75 wells
containing 200 ul of cells. Thus, it is either scaled up, or performed in multiple to
generate sufficient cells for multiple 96 well plates. Jurkat cells are maintained in RPMI
+ 10% serum with 1% Pen-Strep. Combine 2.5 mls of OPTI-MEM (Life Technologies)
with 10 ug of plasmid DNA in a T25 flask. Add 2.5 ml OPTI-MEM containing 50 ul
of DMRIE-C and incubate at room temperature for 15-45 mins.

During the incubation period, count cell concentration, spin down the required
number of cells (10^7 per transfection), and resuspend in OPTI-MEM to a final
concentration of 10^7 cells/ml. Then add 1 ml of 1 x 10^7 cells in OPTI-MEM to T25 flask
and incubate at 37°C for 6 hrs. After the incubation, add 10 ml of RPMI + 15% serum.

The Jurkat:GAS-SEAP stable reporter lines are maintained in RPMI + 10%
serum, 1 mg/ml Genticin, and 1% Pen-Strep. These cells are treated with supernatants
containing a polypeptide as produced by the protocol described in Example 11.

On the day of treatment with the supernatant, the cells should be washed and
resuspended in fresh RPMI + 10% serum to a density of 500,000 cells per ml. The
exact number of cells required will depend on the number of supernatants being
screened. For one 96 well plate, approximately 10 million cells (for 10 plates, 100
million cells) are required.

Transfer the cells to a triangular reservoir boat, in order to dispense the cells into
a 96 well dish, using a 12 channel pipette. Using a 12 channel pipette, transfer 200 ul
of cells into each well (therefore adding 100,000 cells per well).
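
The cell numbers quoted above follow directly from the seeding density and the volume per well. As a brief sketch of that arithmetic (the function names are hypothetical, and no allowance is made for dead volume in the reservoir):

```python
def cells_per_well(density_per_ml: float, ul_per_well: float) -> float:
    """Cells delivered to one well at the given density and volume."""
    return density_per_ml * ul_per_well / 1000.0


def cells_for_plates(density_per_ml: float, ul_per_well: float,
                     wells_per_plate: int = 96, plates: int = 1) -> float:
    """Total cells needed to fill every well of the given number of plates."""
    return cells_per_well(density_per_ml, ul_per_well) * wells_per_plate * plates


if __name__ == "__main__":
    # Protocol values: 500,000 cells/ml, 200 ul per well, 96-well plates.
    print(cells_per_well(500_000, 200))                # cells in one well
    print(cells_for_plates(500_000, 200, plates=1))    # one plate
    print(cells_for_plates(500_000, 200, plates=10))   # ten plates
```

One plate works out to 9.6 million cells, which the protocol rounds up to approximately 10 million.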

After all the plates have been seeded, 50 ul of the supernatants are transferred
directly from the 96 well plate containing the supernatants into each well using a 12
channel pipette. In addition, a dose of exogenous interferon gamma (0.1, 1.0, 10 ng)
is added to wells H9, H10, and H11 to serve as additional positive controls for the
assay.

The 96 well dishes containing Jurkat cells treated with supernatants are placed in
an incubator for 48 hrs (note: this time is variable between 48-72 hrs). 35 ul samples
from each well are then transferred to an opaque 96 well plate using a 12 channel
pipette. The opaque plates should be covered (using cellophane covers) and stored at
-20°C until SEAP assays are performed according to Example 17. The plates
containing the remaining treated cells are placed at 4°C and serve as a source of material
for repeating the assay on a specific well if desired.

As a positive control, 100 Unit/ml interferon gamma, which is known to activate
Jurkat T cells, can be used. Over 30 fold induction is typically observed in the
positive control wells.

Example 14: High-Throughput Screening Assay Identifying Myeloid
Activity

The following protocol is used to assess myeloid activity by identifying factors,
such as growth factors and cytokines, that may proliferate or differentiate myeloid cells.
Myeloid cell activity is assessed using the GAS/SEAP/Neo construct produced in
Example 12. Thus, factors that increase SEAP activity indicate the ability to activate the
Jaks-STATs signal transduction pathway. The myeloid cell used in this assay is U937,
a pre-monocyte cell line, although TF-1, HL60, or KG1 can be used.

To transiently transfect U937 cells with the GAS/SEAP/Neo construct produced
in Example 12, a DEAE-Dextran method (Kharbanda et al., 1994, Cell Growth &
Differentiation, 5:259-265) is used. First, harvest 2 x 10^7 U937 cells and wash with
PBS. The U937 cells are usually grown in RPMI 1640 medium containing 10% heat-inactivated
fetal bovine serum (FBS) supplemented with 100 units/ml penicillin and 100
ug/ml streptomycin.

Next, suspend the cells in 1 ml of 20 mM Tris-HCl (pH 7.4) buffer containing
0.5 mg/ml DEAE-Dextran, 8 ug GAS-SEAP2 plasmid DNA, 140 mM NaCl, 5 mM
KCl, 375 uM Na2HPO4.7H2O, 1 mM MgCl2, and 675 uM CaCl2. Incubate at 37°C
for 45 min.

Wash the cells with RPMI 1640 medium containing 10% FBS and then
resuspend in 10 ml complete medium and incubate at 37°C for 36 hr.

The GAS-SEAP/U937 stable cells are obtained by growing the cells in 400
ug/ml G418. G418-free medium is used for routine growth, but every one to two
months the cells should be re-grown in 400 ug/ml G418 for a couple of passages.

These cells are tested by harvesting 1 x 10^8 cells (enough for ten 96-well
plate assays) and washing with PBS. Suspend the cells in 200 ml of the growth
medium described above, to a final density of 5 x 10^5 cells/ml. Plate 200 ul of cells per well in the
96-well plate (or 1 x 10^5 cells/well).

Add 50 ul of the supernatant prepared by the protocol described in Example 11.
Incubate at 37°C for 48 to 72 hr. As a positive control, 100 Unit/ml interferon gamma,
which is known to activate U937 cells, can be used. Over 30 fold induction is typically
observed in the positive control wells. Assay the supernatant for SEAP activity according to the
protocol described in Example 17.

Example 15: High-Throughput Screening Assay Identifying Neuronal
Activity

When cells undergo differentiation and proliferation, a group of genes are 
activated through many different signal transduction pathways. One of these genes, 
EGR1 (early growth response gene 1), is induced in various tissues and cell types upon 
activation. The promoter of EGR1 is responsible for such induction. Using the EGR1
promoter linked to reporter molecules, activation of cells can be assessed. 

Particularly, the following protocol is used to assess neuronal activity in PC12
cell lines. PC12 cells (rat pheochromocytoma cells) are known to proliferate and/or
differentiate by activation with a number of mitogens, such as TPA (tetradecanoyl
phorbol acetate), NGF (nerve growth factor), and EGF (epidermal growth factor). The
EGR1 gene expression is activated during this treatment. Thus, by stably transfecting
PC12 cells with a construct containing an EGR promoter linked to SEAP reporter,
activation of PC12 cells can be assessed.

The EGR/SEAP reporter construct can be assembled by the following protocol.
The EGR-1 promoter sequence (-633 to +1) (Sakamoto K et al., Oncogene 6:867-871
(1991)) can be PCR amplified from human genomic DNA using the following primers:
5' GCGCTCGAGGGATGACAGCGATAGAACCCCGG-3' (SEQ ID NO:6)
5' GCGAAGCTTCGCGACTCCCCGGATCCGCCTC-3' (SEQ ID NO:7)
Using the GAS:SEAP/Neo vector produced in Example 12, the EGR1 amplified
product can then be inserted into this vector. Linearize the GAS:SEAP/Neo vector
using restriction enzymes XhoI/HindIII, removing the GAS/SV40 stuffer. Restrict the
EGR1 amplified product with these same enzymes. Ligate the vector and the EGR1
promoter.
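
Because the cloning step relies on XhoI and HindIII, it can be worth confirming that each primer carries the expected recognition site. The following is a minimal sketch using plain string search; the helper name `find_sites` is hypothetical, and the reverse primer is assumed to read with an AAGCTT (HindIII) site, as the cloning strategy requires.

```python
# Recognition sequences of the two enzymes named in the cloning step.
SITES = {"XhoI": "CTCGAG", "HindIII": "AAGCTT"}

# Primers for SEQ ID NO:6 and SEQ ID NO:7, written 5' to 3'.
# Assumption: the reverse primer carries the HindIII site AAGCTT.
PRIMERS = {
    "forward (SEQ ID NO:6)": "GCGCTCGAGGGATGACAGCGATAGAACCCCGG",
    "reverse (SEQ ID NO:7)": "GCGAAGCTTCGCGACTCCCCGGATCCGCCTC",
}


def find_sites(seq: str) -> dict:
    """Return {enzyme: 0-based position} for each recognition site found in seq."""
    seq = seq.upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


site_map = {label: find_sites(seq) for label, seq in PRIMERS.items()}
```

Under these assumptions each primer carries exactly one of the two sites, three bases in from its 5' end, so the amplified promoter can be inserted in a single orientation.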

To prepare 96 well-plates for cell culture, 2 ml of a coating solution (1:30
dilution of collagen type I (Upstate Biotech Inc. Cat#08-115) in 30% ethanol (filter
sterilized)) is added per one 10 cm plate or 50 ul per well of the 96-well plate, and
allowed to air dry for 2 hr.

PC12 cells are routinely grown in RPMI-1640 medium (Bio Whittaker)
containing 10% horse serum (JRH BIOSCIENCES, Cat. # 12449-78P), 5% heat-inactivated
fetal bovine serum (FBS) supplemented with 100 units/ml penicillin and 100
ug/ml streptomycin on a precoated 10 cm tissue culture dish. A one to four split is done
every three to four days. Cells are removed from the plates by scraping and
resuspended by pipetting up and down more than 15 times.

Transfect the EGR/SEAP/Neo construct into PC12 using the Lipofectamine
protocol described in Example 11. EGR-SEAP/PC12 stable cells are obtained by
growing the cells in 300 ug/ml G418. G418-free medium is used for routine
growth, but every one to two months the cells should be re-grown in 300 ug/ml G418
for a couple of passages.

To assay for neuronal activity, a 10 cm plate with cells around 70 to 80%
confluent is screened by removing the old medium. Wash the cells once with PBS
(Phosphate buffered saline). Then starve the cells in low serum medium (RPMI-1640
containing 1% horse serum and 0.5% FBS with antibiotics) overnight.

The next morning, remove the medium and wash the cells with PBS. Scrape
off the cells from the plate and suspend the cells well in 2 ml low serum medium. Count
the cell number and add more low serum medium to reach a final cell density of 5 x 10^5
cells/ml.

Add 200 ul of the cell suspension to each well of a 96-well plate (equivalent to
1 x 10^5 cells/well). Add 50 ul of supernatant produced by Example 11 and incubate at
37°C for 48 to 72 hr. As a positive control, a growth factor known to activate PC12 cells
through EGR can be used, such as 50 ng/ul of Neuronal Growth Factor (NGF). Over fifty-fold
induction of SEAP is typically seen in the positive control wells. Assay the
supernatant for SEAP activity according to Example 17.

NF-kB (Nuclear Factor kB) is a transcription factor activated by a wide variety
of agents including the inflammatory cytokines IL-1 and TNF, CD30 and CD40,
lymphotoxin-alpha and lymphotoxin-beta, by exposure to LPS or thrombin, and by
expression of certain viral gene products. As a transcription factor, NF-kB regulates
the expression of genes involved in immune cell activation, control of apoptosis (NF-kB
appears to shield cells from apoptosis), B and T-cell development, anti-viral and
antimicrobial responses, and multiple stress responses.

In non-stimulated conditions, NF-kB is retained in the cytoplasm with I-kB
(Inhibitor kB). However, upon stimulation, I-kB is phosphorylated and degraded,
causing NF-kB to shuttle to the nucleus, thereby activating transcription of target
genes. Target genes activated by NF-kB include IL-2, IL-6, GM-CSF, ICAM-1 and
class 1 MHC.

Due to its central role and ability to respond to a range of stimuli, reporter
constructs utilizing the NF-kB promoter element are used to screen the supernatants
produced in Example 11. Activators or inhibitors of NF-kB would be useful in treating
diseases. For example, inhibitors of NF-kB could be used to treat those diseases
related to the acute or chronic activation of NF-kB, such as rheumatoid arthritis.

To construct a vector containing the NF-kB promoter element, a PCR based
strategy is employed. The upstream primer contains four tandem copies of the NF-kB
binding site (GGGGACTTTCCC) (SEQ ID NO:8), 18 bp of sequence complementary
to the 5' end of the SV40 early promoter sequence, and is flanked with an XhoI site:

5':GCGGCCTCGAGGGGACTTTCCCGGGGACTTTCCGGGGACTTTCCGGGACTTTCCATCCTGCCATCTCAATTAG:3' (SEQ ID NO:9)

The downstream primer is complementary to the 3' end of the SV40 promoter
and is flanked with a Hind III site:

5':GCGGCAAGCTTTTTGCAAAGCCTAGGC:3' (SEQ ID NO:4)

PCR amplification is performed using the SV40 promoter template present in
the pB-gal:promoter plasmid obtained from Clontech. The resulting PCR fragment is
digested with XhoI and Hind III and subcloned into BLSK2- (Stratagene).
Sequencing with the T7 and T3 primers confirms the insert contains the following
sequence:

5':CTCGAGGGGACTTTCCCGGGGACTTTCCGGGGACTTTCCGGGACTTTCC
ATCTGCCATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCCGCCCA
TCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCATGGCTGACT
AATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGCCTCTGAGCTATTC
CAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAAAGCTT:3'
(SEQ ID NO:10)

Next, replace the SV40 minimal promoter element present in the pSEAP2-promoter
plasmid (Clontech) with this NF-kB/SV40 fragment using XhoI and HindIII.
However, this vector does not contain a neomycin resistance gene, and therefore, is not
preferred for mammalian expression systems.

In order to generate stable mammalian cell lines, the NF-kB/SV40/SEAP
cassette is removed from the above NF-kB/SEAP vector using restriction enzymes SalI
and NotI, and inserted into a vector containing neomycin resistance. Particularly, the [...]

Once the NF-kB/SV40/SEAP/Neo vector is created, stable Jurkat T-cells are
created and maintained according to the protocol described in Example 13. Similarly,
the method for assaying supernatants with these stable Jurkat T-cells is also described
in Example 13. As a positive control, exogenous TNF alpha (0.1, 1, 10 ng) is added to
wells H9, H10, and H11, with a 5-10 fold activation typically observed.

Example 17: Assay for SEAP Activity

As a reporter molecule for the assays described in Examples 13-16, SEAP
activity is assayed using the Tropix Phospho-light Kit (Cat. BP-400) according to the
following general procedure. The Tropix Phospho-light Kit supplies the Dilution,
Assay, and Reaction Buffers used below.

Prime a dispenser with the 2.5x Dilution Buffer and dispense 15 ul of 2.5x
dilution buffer into Optiplates containing 35 ul of a supernatant. Seal the plates with a
plastic sealer and incubate at 65°C for 30 min. Separate the Optiplates to avoid uneven
heating.

Cool the samples to room temperature for 15 minutes. Empty the dispenser and
prime with the Assay Buffer. Add 50 ul Assay Buffer and incubate at room
temperature for 5 min. Empty the dispenser and prime with the Reaction Buffer (see the
table below). Add 50 ul Reaction Buffer and incubate at room temperature for 20
minutes. Since the intensity of the chemiluminescent signal is time dependent, and it
takes about 10 minutes to read 5 plates on a luminometer, one should treat 5 plates at a
time and start the second set 10 minutes later.

Read the relative light units in the luminometer. Set H12 as blank, and print the
results. An increase in chemiluminescence indicates reporter activity.
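
The blank subtraction and the fold-induction figures quoted in Examples 13-16 can be computed as in the following sketch. The choice of untreated baseline wells, the function name `fold_induction`, and the readings themselves are assumptions made for illustration; only the use of well H12 as the blank comes from the procedure above.

```python
def fold_induction(rlu: dict, blank_well: str = "H12",
                   baseline_wells: tuple = ("A1",)) -> dict:
    """Blank-subtract each well and express it relative to untreated baseline wells."""
    blank = rlu[blank_well]
    corrected = {w: v - blank for w, v in rlu.items() if w != blank_well}
    baseline = sum(corrected[w] for w in baseline_wells) / len(baseline_wells)
    if baseline <= 0:
        raise ValueError("baseline signal must exceed the blank")
    return {w: v / baseline for w, v in corrected.items()}


# Made-up relative light unit readings, for illustration only.
readings = {"A1": 1200.0, "A2": 1250.0, "H9": 34200.0, "H12": 100.0}
folds = fold_induction(readings, baseline_wells=("A1",))
```

With these invented readings, well H9 comes out at 31-fold over baseline, which is the kind of value the positive control wells are expected to show.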

Reaction Buffer Formulation:

# of plates    Rxn buffer diluent (ml)    CSPD (ml)

Example 18: High-Throughput Screening Assay Identifying Changes in
Small Molecule Concentration and Membrane Permeability

Binding of a ligand to a receptor is known to alter intracellular levels of small
molecules, such as calcium, potassium, sodium, and pH, as well as alter membrane
potential. These alterations can be measured in an assay to identify supernatants which
bind to receptors of a particular cell. Although the following protocol describes an
assay for calcium, this protocol can easily be modified to detect changes in potassium,
sodium, pH, membrane potential, or any other small molecule which is detectable by a
fluorescent probe.

The following assay uses a Fluorometric Imaging Plate Reader ("FLIPR") to
measure changes in fluorescent molecules (Molecular Probes) that bind small
molecules. Clearly, any fluorescent molecule detecting a small molecule can be used
instead of the calcium fluorescent molecule, fluo-3, used here.
For adherent cells, seed the cells at 10,000-20,000 cells/well in a [...]
(Hank's Balanced Salt Solution), leaving 100 ul of buffer after the final wash.

A stock solution of 1 mg/ml fluo-3 is made in 10% pluronic acid DMSO. To
load the cells with fluo-3, 50 ul of 12 ug/ml fluo-3 is added to each well. The plate is
incubated at 37°C in a CO2 incubator for 60 min. The plate is washed four times in the
Biotek washer with HBSS, leaving 100 ul of buffer.
For non-adherent cells, the cells are spun down from culture media. Cells are
re-suspended to 2-5 x 10^6 cells/ml with HBSS in a 50-ml conical tube. 4 ul of 1 mg/ml
fluo-3 solution in 10% pluronic acid DMSO is added to each ml of cell suspension.
The tube is then placed in a 37°C water bath for 30-60 min. The cells are washed twice
with HBSS, resuspended to 1 x 10^6 cells/ml, and dispensed into a microplate, 100
ul/well. The plate is centrifuged at 1000 rpm for 5 min. The plate is then washed once
in Denley Cell Wash with 200 ul, followed by an aspiration step to 100 ul final volume.
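
The loading concentration for non-adherent cells follows from a simple dilution. As a rough sketch (the function name is hypothetical, and mixing is assumed to be ideal):

```python
def final_concentration(stock_conc: float, stock_vol: float, diluent_vol: float) -> float:
    """C2 = C1 * V1 / (V1 + V2); both volumes in the same unit."""
    return stock_conc * stock_vol / (stock_vol + diluent_vol)


# 4 ul of 1 mg/ml (1000 ug/ml) fluo-3 stock added to 1 ml (1000 ul) of cell suspension.
fluo3_ug_per_ml = final_concentration(1000.0, 4.0, 1000.0)   # about 3.98 ug/ml

# Cells dispensed at 1 x 10^6 cells/ml, 100 ul per well.
cells_in_well = 1e6 * 100 / 1000                              # 100,000 cells per well
```

The dye therefore ends up at roughly 4 ug/ml during loading, and each well receives about 100,000 cells.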

For a non-cell based assay, each well contains a fluorescent molecule, such as 
fluo-3. The supernatant is added to the well, and a change in fluorescence is detected. 

To measure the fluorescence of intracellular calcium, the FLIPR is set for the
following parameters: (1) System gain is 300-800 mW; (2) Exposure time is 0.4
second; (3) Camera F/stop is F/2; (4) Excitation is 488 nm; (5) Emission is 530 nm; and
(6) Sample addition is 50 ul. Increased emission at 530 nm indicates an extracellular
signaling event which has resulted in an increase in the intracellular Ca2+
concentration.

Example 19: High-Throughput Screening Assay Identifying Tyrosine 
Kinase Activity 

The Protein Tyrosine Kinases (PTK) represent a diverse group of
transmembrane and cytoplasmic kinases. Within the Receptor Protein Tyrosine Kinase
(RPTK) group are receptors for a range of mitogenic and metabolic growth factors
including the PDGF, FGF, EGF, NGF, HGF and Insulin receptor subfamilies. In
addition there is a large family of RPTKs for which the corresponding ligand is
unknown. Ligands for RPTKs include mainly secreted small proteins, but also
membrane-bound and extracellular matrix proteins.

Activation of RPTK by ligands involves ligand-mediated receptor dimerization,
resulting in transphosphorylation of the receptor subunits and activation of the
cytoplasmic tyrosine kinases. The cytoplasmic tyrosine kinases include receptor
associated tyrosine kinases of the src-family (e.g., src, yes, lck, lyn, fyn) and non-receptor
linked and cytosolic protein tyrosine kinases, such as the Jak family, members
of which mediate signal transduction triggered by the cytokine superfamily of receptors
(e.g., the Interleukins, Interferons, GM-CSF, and Leptin).

Because of the wide range of known factors capable of stimulating tyrosine
kinase activity, the identification of novel human secreted proteins capable of activating
tyrosine kinase signal transduction pathways is of interest. Therefore, the following
protocol is designed to identify those novel human secreted proteins capable of
activating the tyrosine kinase signal transduction pathways.

Seed target cells (e.g., primary keratinocytes) at a density of approximately
25,000 cells per well in 96 well Loprodyne Silent Screen Plates purchased from
Nalge Nunc (Naperville, IL). The plates are sterilized with two 30 minute rinses with
100% ethanol, rinsed with water and dried overnight. Some plates are coated for 2 hr
with 100 ul of cell culture grade type I collagen (50 ug/ml), gelatin (2%) or polylysine
(50 ug/ml), all of which can be purchased from Sigma Chemicals (St. Louis, MO), or
10% Matrigel purchased from Becton Dickinson (Bedford, MA), or calf serum, rinsed
with PBS and stored at 4°C. Cell growth on these plates is assayed by seeding 5,000
cells/well in growth medium and indirect quantitation of cell number through use of
alamarBlue as described by the manufacturer, Alamar Biosciences, Inc. (Sacramento,
CA), after 48 hr. Falcon plate covers #3071 from Becton Dickinson (Bedford, MA) are
used to cover the Loprodyne Silent Screen Plates. Falcon Microtest III cell culture
plates can also be used in some proliferation experiments.

To prepare extracts, A431 cells are seeded onto the nylon membranes of
Loprodyne plates (20,000/200 ul/well) and cultured overnight in complete medium.
Cells are quiesced by incubation in serum-free basal medium for 24 hr. After 5-20
minutes treatment with EGF (60 ng/ml) or 50 ul of the supernatant produced in Example
11, the medium is removed and 100 ul of extraction buffer (20 mM HEPES pH
7.5, 0.15 M NaCl, 1% Triton X-100, 0.1% SDS, 2 mM Na3VO4, 2 mM Na4P2O7
and a cocktail of protease inhibitors (# 1836170) obtained from Boehringer Mannheim
(Indianapolis, IN)) is added to each well and the plate is shaken on a rotating shaker for
5 minutes at 4°C. The plate is then placed in a vacuum transfer manifold and the extract
filtered through the 0.45 um membrane bottoms of each well using house vacuum.
Extracts are collected in a 96-well catch/assay plate in the bottom of the vacuum
manifold and immediately placed on ice. To obtain extracts clarified by centrifugation,
the content of each well, after detergent solubilization for 5 minutes, is removed and
centrifuged for 15 minutes at 4°C at 16,000 x g.

Test the filtered extracts for levels of tyrosine kinase activity. Although many [...]
biotinylated peptide). Biotinylated peptides that can be used for this purpose include
PSK1 (corresponding to amino acids 6-20 of the cell division kinase cdc2-p34) and
PSK2 (corresponding to amino acids 1-17 of gastrin). Both peptides are substrates for
a range of tyrosine kinases and are available from Boehringer Mannheim.
The tyrosine kinase reaction is set up by adding the following components in
order. First, add 10 ul of 5 uM Biotinylated Peptide, then 10 ul ATP/Mg2+ (5 mM
ATP/50 mM MgCl2), then 10 ul of 5x Assay Buffer (40 mM imidazole hydrochloride,
pH 7.3, 40 mM beta-glycerophosphate, 1 mM EGTA, 100 mM MgCl2, 5 mM MnCl2,
0.5 mg/ml BSA), then 5 ul of Sodium Vanadate (1 mM), and then 5 ul of water. Mix the
components gently and preincubate the reaction mix at 30°C for 2 min. Initiate the
reaction by adding 10 ul of the control enzyme or the filtered supernatant.

The tyrosine kinase assay reaction is then terminated by adding 10 ul of 120 mM
EDTA and placing the reactions on ice.
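
The component volumes above add up to a 50 ul reaction, so the 5x Assay Buffer ends up at 1x. The following sketch tallies the volumes and the resulting final concentrations; the data layout and variable names are hypothetical, and the figures are plain dilution arithmetic rather than measured values.

```python
# (component, volume in ul, stock concentration, unit), in the order of addition.
COMPONENTS = [
    ("biotinylated peptide", 10, 5.0, "uM"),
    ("ATP", 10, 5.0, "mM"),
    ("assay buffer", 10, 5.0, "x"),
    ("sodium vanadate", 5, 1.0, "mM"),
    ("water", 5, None, None),
    ("enzyme or filtered supernatant", 10, None, None),
]

reaction_volume = sum(vol for _, vol, _, _ in COMPONENTS)      # 50 ul

# Final concentration of each component that has a stated stock concentration.
final = {
    name: (stock * vol / reaction_volume, unit)
    for name, vol, stock, unit in COMPONENTS
    if stock is not None
}

# Stopping with 10 ul of 120 mM EDTA brings the volume to 60 ul.
stopped_volume = reaction_volume + 10
edta_mM = 120.0 * 10 / stopped_volume                          # 20 mM
```

On these numbers the reaction contains 1 uM peptide, 1 mM ATP, 1x assay buffer and 0.1 mM vanadate, and the stopped mixture holds 20 mM EDTA in 60 ul.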

Tyrosine kinase activity is determined by transferring a 50 ul aliquot of the reaction
mixture to a microtiter plate (MTP) module and incubating at 37°C for 20 min. This
allows the streptavidin coated 96 well plate to associate with the biotinylated peptide.
Wash the MTP module with 300 ul/well of PBS four times. Next add 75 ul of
anti-phosphotyrosine antibody conjugated to horseradish peroxidase (anti-P-Tyr-POD
(0.5 u/ml)) to each well and incubate at 37°C for one hour. Wash the well as
above.

Next add 100 ul of peroxidase substrate solution (Boehringer Mannheim) and
incubate at room temperature for at least 5 min (up to 30 min). Measure the
absorbance of the sample at 405 nm using an ELISA reader. The level of bound
peroxidase activity is quantitated using an ELISA reader and reflects the level of
tyrosine kinase activity.

Example 20: High-Throughput Screening Assay Identifying 
Phosphorylation Activity 

As a potential alternative and/or complement to the assay of protein tyrosine
kinase activity described in Example 19, an assay which detects activation
(phosphorylation) of major intracellular signal transduction intermediates can also be
used. For example, as described below one particular assay can detect tyrosine
phosphorylation of the Erk-1 and Erk-2 kinases. However, phosphorylation of other
molecules, such as Raf, JNK, p38 MAP, Map kinase kinase (MEK), MEK kinase,
Src, Muscle specific kinase (MuSK), IRAK, Tec, and Janus, as well as any other
phosphoserine, phosphotyrosine, or phosphothreonine molecule, can be detected by
substituting these molecules for Erk-1 or Erk-2 in the following assay.

Specifically, assay plates are made by coating the wells of a 96-well ELISA
plate with 0.1 ml of protein G (1 ug/ml) for 2 hr at room temp (RT). The plates are then
rinsed with PBS and blocked with 3% BSA/PBS for 1 hr at RT. The protein G plates
are then treated with 2 commercial monoclonal antibodies (100 ng/well) against Erk-1
and Erk-2 (1 hr at RT) (Santa Cruz Biotechnology). (To detect other molecules, this
step can easily be modified by substituting a monoclonal antibody detecting any of the
above described molecules.) After 3-5 rinses with PBS, the plates are stored at 4°C
until use.

A431 cells are seeded at 20,000/well in a 96-well Loprodyne filterplate and
cultured overnight in growth medium. The cells are then starved for 48 hr in basal
medium (DMEM) and then treated with EGF (6 ng/well) or 50 ul of the supernatants
obtained in Example 11 for 5-20 minutes. The cells are then solubilized and extracts
filtered directly into the assay plate.

After incubation with the extract for 1 hr at RT, the wells are again rinsed. As a
positive control, a commercial preparation of MAP kinase (10 ng/well) is used in place
of A431 extract. Plates are then treated with a commercial polyclonal (rabbit) antibody
(1 ug/ml) which specifically recognizes the phosphorylated epitope of the Erk-1 and
Erk-2 kinases (1 hr at RT). This antibody is biotinylated by standard procedures. The
bound polyclonal antibody is then quantitated by successive incubations with
Europium-streptavidin and Europium fluorescence enhancing reagent in the Wallac
DELFIA instrument (time-resolved fluorescence). An increased fluorescent signal over
background indicates phosphorylation.


Example 21: Method of Determining Alterations in a Gene 
Corresponding to a Polynucleotide 

RNA is isolated from entire families or individual patients presenting with a
phenotype of interest (such as a disease). cDNA is then generated from
these RNA samples using protocols known in the art. (See, Sambrook.) The cDNA is
then used as a template for PCR, employing primers surrounding regions of interest in
SEQ ID NO:X. Suggested PCR conditions consist of 35 cycles at 95°C for 30
seconds; 60-120 seconds at 52-58°C; and 60-120 seconds at 70°C, using buffer [...]
polynucleotide kinase, employing SequiTherm Polymerase (Epicentre Technologies).

The intron-exon borders of selected exons are also determined and genomic PCR
products analyzed to confirm the results. PCR products harboring suspected mutations
are then cloned and sequenced to validate the results of the direct sequencing.

PCR products are cloned into T-tailed vectors as described in Holton, T.A. and
Graham, M.W., Nucleic Acids Research, 19:1156 (1991) and sequenced with T7
polymerase (United States Biochemical). Affected individuals are identified by
mutations not present in unaffected individuals.

Genomic rearrangements are also observed as a method of determining
alterations in a gene corresponding to a polynucleotide. Genomic clones isolated
according to Example 2 are nick-translated with digoxigenin-deoxyuridine
5'-triphosphate (Boehringer Mannheim), and FISH performed as described in Johnson,
Cg. et al., Methods Cell Biol. 35:73-99 (1991). Hybridization with the labeled probe is
carried out using a vast excess of human cot-1 DNA for specific hybridization to the
corresponding genomic locus.

Chromosomes are counterstained with 4',6-diamidino-2-phenylindole and
propidium iodide, producing a combination of C- and R-bands. Aligned images for
precise mapping are obtained using a triple-band filter set (Chroma Technology,
Brattleboro, VT) in combination with a cooled charge-coupled device camera
(Photometrics, Tucson, AZ) and variable excitation wavelength filters. (Johnson, Cv.
et al., Genet. Anal. Tech. Appl., 8:75 (1991).) Image collection, analysis and
chromosomal fractional length measurements are performed using the ISee Graphical
Program System. (Inovision Corporation, Durham, NC.) Chromosome alterations of
the genomic region hybridized by the probe are identified as insertions, deletions, and
translocations. These alterations are used as a diagnostic marker for an associated
disease.

Example 22: Method of Detecting Abnormal Levels of a Polypeptide in a
Biological Sample

A polypeptide of the present invention can be detected in a biological sample, 
and if an increased or decreased level of the polypeptide is detected, this polypeptide is 
a marker for a particular phenotype. Methods of detection are numerous, and thus, it is 
understood that one skilled in the art can modify the following assay to fit their 
particular needs. 

For example, antibody-sandwich ELISAs are used to detect polypeptides in a
sample, preferably a biological sample. Wells of a microtiter plate are coated with
specific antibodies, at a final concentration of 0.2 to 10 ug/ml. The antibodies are either
monoclonal or polyclonal and are produced by the method described in Example 10.
The wells are blocked so that non-specific binding of the polypeptide to the well is
reduced.

The coated wells are then incubated for more than 2 hours at RT with a sample
containing the polypeptide. Preferably, serial dilutions of the sample should be used to
validate results. The plates are then washed three times with deionized or distilled water
to remove unbound polypeptide.

Next, 50 ul of specific antibody-alkaline phosphatase conjugate, at a
concentration of 25-400 ng, is added and incubated for 2 hours at room temperature.
The plates are again washed three times with deionized or distilled water to remove
unbound conjugate.

Add 75 ul of 4-methylumbelliferyl phosphate (MUP) or p-nitrophenyl
phosphate (NPP) substrate solution to each well and incubate 1 hour at room
temperature. Measure the reaction by a microtiter plate reader. Prepare a standard
curve, using serial dilutions of a control sample, and plot polypeptide concentration on
the X-axis (log scale) and fluorescence or absorbance on the Y-axis (linear scale).
Interpolate the concentration of the polypeptide in the sample using the standard curve.
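
Interpolation on a standard curve with a logarithmic concentration axis can be sketched as follows. This is only one simple way to do it (piecewise-linear interpolation between bracketing standards); the function name and the standard values are invented for illustration, and a fitted curve model could equally be used.

```python
import math


def interpolate_concentration(signal: float, standards: list) -> float:
    """Estimate concentration by linear interpolation of signal against
    log10(concentration) between the two bracketing standards.

    standards: (concentration, signal) pairs; signal must rise strictly
    with concentration.
    """
    pts = sorted((math.log10(c), s) for c, s in standards)
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if y0 <= signal <= y1:
            x = x0 + (signal - y0) * (x1 - x0) / (y1 - y0)
            return 10 ** x
    raise ValueError("signal lies outside the standard curve")


# Made-up serial dilution of a control sample: (ng/ml, absorbance).
curve = [(1, 0.10), (10, 0.50), (100, 0.90), (1000, 1.30)]
estimate = interpolate_concentration(0.70, curve)   # midway between 10 and 100 on a log scale
```

A reading halfway between the 10 and 100 ng/ml standards maps to about 31.6 ng/ml rather than 55 ng/ml, which is the practical consequence of plotting concentration on a log scale. Readings outside the range of the standards are rejected rather than extrapolated.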

Example 23: Formulating a Polypeptide 

The secreted polypeptide composition will be formulated and dosed in a fashion
consistent with good medical practice, taking into account the clinical condition of the
individual patient (especially the side effects of treatment with the secreted polypeptide
alone), the site of delivery, the method of administration, the scheduling of
administration, and other factors known to practitioners. The "effective amount" for
purposes herein is thus determined by such considerations.
As a general proposition, the total pharmaceutically effective amount of secreted
polypeptide administered parenterally per dose will be in the range of about 1 ug/kg/day
to 10 mg/kg/day of patient body weight, although, as noted above, this will be subject
to therapeutic discretion. More preferably, this dose is at least 0.01 mg/kg/day, and
most preferably for humans between about 0.01 and 1 mg/kg/day for the hormone. If
given continuously, the secreted polypeptide is typically administered at a dose rate of
about 1 ug/kg/hour to about 50 ug/kg/hour, either by 1-4 injections per day or by
continuous subcutaneous infusions, for example, using a mini-pump. An intravenous
bag solution may also be employed. The length of treatment needed to observe changes [...]

[...] of the invention are
administered orally, rectally, parenterally, intracisternally, intravaginally,
intraperitoneally, topically (as by powders, ointments, gels, drops or transdermal
patch), bucally, or as an oral or nasal spray. "Pharmaceutically acceptable carrier" refers
to a non-toxic solid, semisolid or liquid filler, diluent, encapsulating material or
formulation auxiliary of any type. The term "parenteral" as used herein refers to modes
of administration which include intravenous, intramuscular, intraperitoneal, intrasternal,
subcutaneous and intraarticular injection and infusion.

The secreted polypeptide is also suitably administered by sustained-release
systems. Suitable examples of sustained-release compositions include semi-permeable
polymer matrices in the form of shaped articles, e.g., films, or microcapsules.
Sustained-release matrices include polylactides (U.S. Pat. No. 3,773,919, EP 58,481),
copolymers of L-glutamic acid and gamma-ethyl-L-glutamate (Sidman, U. et al.,
Biopolymers 22:547-556 (1983)), poly (2-hydroxyethyl methacrylate) (R. Langer et
al., J. Biomed. Mater. Res. 15:167-277 (1981), and R. Langer, Chem. Tech.
12:98-105 (1982)), ethylene vinyl acetate (R. Langer et al.) or poly-D-(-)-3-hydroxybutyric
acid (EP 133,988). Sustained-release compositions also include liposomally entrapped
polypeptides. Liposomes containing the secreted polypeptide are prepared by methods
known per se: DE 3,218,121; Epstein et al., Proc. Natl. Acad. Sci. USA 82:3688-3692
(1985); Hwang et al., Proc. Natl. Acad. Sci. USA 77:4030-4034 (1980); EP 52,322;
EP 36,676; EP 88,046; EP 143,949; EP 142,641; Japanese Pat. Appl. 83-118008;
U.S. Pat. Nos. 4,485,045 and 4,544,545; and EP 102,324. Ordinarily, the liposomes
are of the small (about 200-800 Angstroms) unilamellar type in which the lipid content
is greater than about 30 mol. percent cholesterol, the selected proportion being adjusted
for the optimal secreted polypeptide therapy.

For parenteral administration, in one embodiment, the secreted polypeptide is
formulated generally by mixing it at the desired degree of purity, in a unit dosage
injectable form (solution, suspension, or emulsion), with a pharmaceutically acceptable
carrier, i.e., one that is non-toxic to recipients at the dosages and concentrations
employed and is compatible with other ingredients of the formulation. For example, the
formulation preferably does not include oxidizing agents and other compounds that are
known to be deleterious to polypeptides.

Generally, the formulations are prepared by contacting the polypeptide
uniformly and intimately with liquid carriers or finely divided solid carriers or both.
Then, if necessary, the product is shaped into the desired formulation. Preferably the
carrier is a parenteral carrier, more preferably a solution that is isotonic with the blood
of the recipient. Examples of such carrier vehicles include water, saline, Ringer's
solution, and dextrose solution. Non-aqueous vehicles such as fixed oils and ethyl
oleate are also useful herein, as well as liposomes.

The carrier suitably contains minor amounts of additives such as substances that
enhance isotonicity and chemical stability. Such materials are non-toxic to recipients at
the dosages and concentrations employed, and include buffers such as phosphate,
citrate, succinate, acetic acid, and other organic acids or their salts; antioxidants such as
ascorbic acid; low molecular weight (less than about ten residues) polypeptides, e.g.,
polyarginine or tripeptides; proteins, such as serum albumin, gelatin, or
immunoglobulins; hydrophilic polymers such as polyvinylpyrrolidone; amino acids,
such as glycine, glutamic acid, aspartic acid, or arginine; monosaccharides,
disaccharides, and other carbohydrates including cellulose or its derivatives, glucose,
mannose, or dextrins; chelating agents such as EDTA; sugar alcohols such as mannitol or
sorbitol; counterions such as sodium; and/or nonionic surfactants such as polysorbates,
poloxamers, or PEG.

The secreted polypeptide is typically formulated in such vehicles at a 
concentration of about 0.1 mg/ml to 100 mg/ml, preferably 1-10 mg/ml, at a pH of 
about 3 to 8. It will be understood that the use of certain of the foregoing excipients, 
carriers, or stabilizers will result in the formation of polypeptide salts. 

Any polypeptide to be used for therapeutic administration can be sterile. 
Sterility is readily accomplished by filtration through sterile filtration membranes (e.g., 
0.2 micron membranes). Therapeutic polypeptide compositions generally are placed 
into a container having a sterile access port, for example, an intravenous solution bag or 
vial having a stopper pierceable by a hypodermic injection needle. 

Polypeptides ordinarily will be stored in unit or multi-dose containers, for 
example, sealed ampoules or vials, as an aqueous solution or as a lyophilized 
formulation for reconstitution. As an example of a lyophilized formulation, 10-ml vials 
are filled with 5 ml of sterile-filtered 1% (w/v) aqueous polypeptide solution, and the 
resulting mixture is lyophilized. The infusion solution is prepared by reconstituting the 
lyophilized polypeptide using bacteriostatic Water-for-Injection. 
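
The arithmetic behind the lyophilized example can be sketched as follows. This is an illustrative calculation only: the function names and the reconstitution volume used in the usage lines are assumptions and are not part of the disclosure. It relies on the fact that a 1% (w/v) solution contains 1 g per 100 ml, i.e., 10 mg per ml.

```python
def polypeptide_mass_mg(fill_volume_ml: float, percent_w_v: float) -> float:
    """Mass of polypeptide (mg) in a vial filled with a % (w/v) solution.

    1% (w/v) = 1 g per 100 ml = 10 mg per ml.
    """
    if fill_volume_ml < 0 or percent_w_v < 0:
        raise ValueError("volume and concentration must be non-negative")
    return fill_volume_ml * percent_w_v * 10.0


def reconstituted_conc_mg_per_ml(mass_mg: float, diluent_ml: float) -> float:
    """Concentration after the lyophilized cake is dissolved in diluent.

    The diluent volume is a hypothetical input; the text does not state one.
    """
    if diluent_ml <= 0:
        raise ValueError("diluent volume must be positive")
    return mass_mg / diluent_ml


# 5 ml of a 1% (w/v) solution, as in the example above
vial_mass_mg = polypeptide_mass_mg(5.0, 1.0)
# Assumed reconstitution back to the original 5 ml fill volume
conc = reconstituted_conc_mg_per_ml(vial_mass_mg, 5.0)
```

Under these assumptions each vial holds 50 mg of polypeptide, and reconstituting to the original fill volume gives 10 mg/ml, the upper end of the preferred 1-10 mg/ml range stated above.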

The invention also provides a pharmaceutical pack or kit comprising one or 
more containers filled with one or more of the ingredients of the pharmaceutical 
compositions of the invention. Associated with such container(s) can be a notice in the 
form prescribed by a governmental agency regulating the manufacture, use or sale of 
pharmaceuticals or biological products, which notice reflects approval by the agency of 
manufacture, use or sale for human administration. In addition, the polypeptides of the 
present invention may be employed in conjunction with other therapeutic compounds. 



Example 24: Method of Treating Decreased Levels of the Polypeptide 

It will be appreciated that conditions caused by a decrease in the standard or 
normal expression level of a secreted protein in an individual can be treated by 
administering the polypeptide of the present invention, preferably in the secreted form. 
Thus, the invention also provides a method of treatment of an individual in need of an 
increased level of the polypeptide comprising administering to such an individual a 
pharmaceutical composition comprising an amount of the polypeptide to increase the 
activity level of the polypeptide in such an individual. 

For example, a patient with decreased levels of a polypeptide receives a daily 
dose of 0.1-100 ug/kg of the polypeptide for six consecutive days. Preferably, the 
polypeptide is in the secreted form. The exact details of the dosing scheme, based on 
administration and formulation, are provided in Example 23. 

Example 25: Method of Treating Increased Levels of the Polypeptide 

Antisense technology is used to inhibit production of a polypeptide of the 
present invention. This technology is one example of a method of decreasing levels of 
a polypeptide, preferably a secreted form, due to a variety of etiologies, such as cancer. 

For example, a patient diagnosed with abnormally increased levels of a 
polypeptide is administered intravenously antisense polynucleotides at 0.5, 1.0, 1.5, 
2.0 and 3.0 mg/kg/day for 21 days. This treatment is repeated after a 7-day rest period 
if the treatment was well tolerated. The formulation of the antisense polynucleotide is 
provided in Example 23. 
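
The dosing pattern described above (a treatment course, a rest period, then a repeat course) can be laid out day by day. The following is a minimal sketch, not a clinical protocol. It assumes that each listed dose is an alternative dose level rather than a sequence given to one patient, and the function name, the default of two courses, and the 70 kg body weight in the usage line are hypothetical.

```python
from typing import List

# Dose levels named in the example, in mg per kg body weight per day
DOSE_LEVELS_MG_PER_KG = (0.5, 1.0, 1.5, 2.0, 3.0)


def antisense_schedule(dose_mg_per_kg: float, weight_kg: float,
                       treat_days: int = 21, rest_days: int = 7,
                       courses: int = 2) -> List[float]:
    """Return the daily dose in mg for each day of the schedule.

    Rest days are represented as 0.0. A rest period is inserted only
    between courses, never after the last one.
    """
    if dose_mg_per_kg <= 0 or weight_kg <= 0:
        raise ValueError("dose and weight must be positive")
    daily_mg = dose_mg_per_kg * weight_kg
    schedule: List[float] = []
    for course in range(courses):
        schedule.extend([daily_mg] * treat_days)
        if course < courses - 1:
            schedule.extend([0.0] * rest_days)
    return schedule


# Hypothetical 70 kg patient at the 1.0 mg/kg/day level
plan = antisense_schedule(1.0, 70.0)
```

With the defaults, the plan covers 21 treatment days, 7 rest days, and a further 21 treatment days. Whether the second course is given at all depends on tolerability, as stated above, which this sketch does not model.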

Example 26: Method of Treatment Using Gene Therapy 

One method of gene therapy transplants fibroblasts, which are capable of 
expressing a polypeptide, onto a patient. Generally, fibroblasts are obtained from a 
subject by skin biopsy. The resulting tissue is placed in tissue-culture medium and 
separated into small pieces. Small chunks of the tissue are placed on a wet surface of a 
tissue culture flask; approximately ten pieces are placed in each flask. The flask is 
turned upside down, closed tight and left at room temperature overnight. After 24 
hours at room temperature, the flask is inverted, the chunks of tissue remain fixed to 
the bottom of the flask, and fresh media (e.g., Ham's F12 media, with 10% FBS, 
penicillin and streptomycin) is added. The flasks are then incubated at 37°C for 
approximately one week. 

At this time, fresh media is added and subsequently changed every several days. 
After an additional two weeks in culture, a monolayer of fibroblasts emerges. The 
monolayer is trypsinized and scaled into larger flasks. 

pMV-7 (Kirschmeier, P.T. et al., DNA, 7:219-25 (1988)), flanked by the long 
terminal repeats of the Moloney murine sarcoma virus, is digested with EcoRI and 
HindIII and subsequently treated with calf intestinal phosphatase. The linear vector is 
fractionated on agarose gel and purified, using glass beads. 

The cDNA encoding a polypeptide of the present invention can be amplified 
using PCR primers which correspond to the 5' and 3' end sequences respectively as set 
forth in Example 1. Preferably, the 5' primer contains an EcoRI site and the 3' primer 
includes a HindIII site. Equal quantities of the Moloney murine sarcoma virus linear 
backbone and the amplified EcoRI and HindIII fragment are added together, in the 
presence of T4 DNA ligase. The resulting mixture is maintained under conditions 
appropriate for ligation of the two fragments. The ligation mixture is then used to 
transform bacteria HB101, which are then plated onto agar containing kanamycin for 
the purpose of confirming that the vector has the gene of interest properly inserted. 
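
The primer design requirement above (an EcoRI site in the 5' primer, a HindIII site in the 3' primer) can be checked with a short script. This is a sketch under stated assumptions: the two primer sequences are hypothetical placeholders, not primers from Example 1, and the helper names are invented for illustration. The recognition sequences GAATTC (EcoRI) and AAGCTT (HindIII) are palindromic, so a plain substring search on the primer as written is sufficient.

```python
# Recognition sequences of the two enzymes used to open the vector
RECOGNITION_SITES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT"}


def has_site(primer: str, enzyme: str) -> bool:
    """True if the primer contains the enzyme's recognition sequence.

    Whitespace is removed first so that sequences written in blocks of
    ten bases (as in a sequence listing) can be passed in directly.
    """
    seq = "".join(primer.split()).upper()
    return RECOGNITION_SITES[enzyme] in seq


def check_primer_pair(five_prime: str, three_prime: str) -> bool:
    """True if the 5' primer carries EcoRI and the 3' primer HindIII."""
    return has_site(five_prime, "EcoRI") and has_site(three_prime, "HindIII")


# Hypothetical primers for illustration only
FWD_PRIMER = "GCGGAATTCATGGCTAGCTAGCTA"
REV_PRIMER = "GCGAAGCTTTCAGGATCCGATCGA"

pair_ok = check_primer_pair(FWD_PRIMER, REV_PRIMER)
```

A check of this kind only confirms that the sites are present. It does not confirm that the sites are absent from the cDNA insert itself, which would also need to be verified before digestion.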

The amphotropic PA317 or GP+am12 packaging cells are grown in tissue 
culture to confluent density in Dulbecco's Modified Eagles Medium (DMEM) with 10% 
calf serum (CS), penicillin and streptomycin. The MSV vector containing the gene is 
then added to the media and the packaging cells transduced with the vector. The 
packaging cells now produce infectious viral particles containing the gene (the 
packaging cells are now referred to as producer cells). 

Fresh media is added to the transduced producer cells, and subsequently, the 
media is harvested from a 10 cm plate of confluent producer cells. The spent media, 
containing the infectious viral particles, is filtered through a millipore filter to remove 
detached producer cells and this media is then used to infect fibroblast cells. Media is 
removed from a sub-confluent plate of fibroblasts and quickly replaced with the media 
from the producer cells. This media is removed and replaced with fresh media. If the 
titer of virus is high, then virtually all fibroblasts will be infected and no selection is 
required. If the titer is very low, then it is necessary to use a retroviral vector that has a 
selectable marker, such as neo or his. Once the fibroblasts have been efficiently 
infected, the fibroblasts are analyzed to determine whether protein is produced. 

The engineered fibroblasts are then transplanted onto the host, either alone or 

Example 27: Method of Treatment Using Gene Therapy - In Vivo 

Another aspect of the present invention is using in vivo gene therapy 
methods to treat disorders, diseases and conditions. The gene therapy method 
relates to the introduction of naked nucleic acid (DNA, RNA, and antisense 
DNA or RNA) sequences into an animal to increase or decrease the expression 
of the polypeptide of the present invention. A polynucleotide of the present 
invention may be operatively linked to a promoter or any other genetic elements 
necessary for the expression of the encoded polypeptide by the target tissue. 
Such gene therapy and delivery techniques and methods are known in the art; 
see, for example, WO 90/11092, WO 98/11779; U.S. Patent Nos. 5693622, 
5705151, 5580859; Tabata H. et al. (1997) Cardiovasc. Res. 35(3):470-479, 
Chao J. et al. (1997) Pharmacol. Res. 35(6):517-522, Wolff J. A. (1997) 
Neuromuscul. Disord. 7(5):314-318, Schwartz B. et al. (1996) Gene Ther. 
3(5):405-411, Tsurumi Y. et al. (1996) Circulation 94(12):3281-3290 
(incorporated herein by reference). 

The polynucleotide constructs of the present invention may be delivered 
by any method that delivers injectable materials to the cells of an animal, such 
as, injection into the interstitial space of tissues (heart, muscle, skin, lung, liver, 
intestine and the like). These polynucleotide constructs can be delivered in a 
pharmaceutically acceptable liquid or aqueous carrier. 

The term "naked" polynucleotide, DNA or RNA, refers to sequences 
that are free from any delivery vehicle that acts to assist, promote, or facilitate 
entry into the cell, including viral sequences, viral particles, liposome 
formulations, lipofectin or precipitating agents and the like. However, the 
polynucleotides may also be delivered in liposome formulations (such as those 
taught in Felgner P.L. et al. (1995) Ann. NY Acad. Sci. 772:126-139 and 
Abdallah B. et al. (1995) Biol. Cell 85(1):1-7) which can be prepared by 
methods well known to those skilled in the art. 

The polynucleotide vector constructs of the present invention used in 
the gene therapy method are preferably constructs that will not integrate into the 
host genome nor will they contain sequences that allow for replication. Any 
strong promoter known to those skilled in the art can be used for driving the 
expression of DNA. Unlike other gene therapy techniques, one major 
advantage of introducing naked nucleic acid sequences into target cells is the 
transitory nature of the polynucleotide synthesis in the cells. Studies have 
shown that non-replicating DNA sequences can be introduced into cells to 
provide production of the desired polypeptide for periods of up to six months. 

The polynucleotide construct of the present invention can be delivered to 
the interstitial space of tissues within an animal, including muscle, skin, 
brain, lung, liver, spleen, bone marrow, thymus, heart, lymph, blood, bone, 
cartilage, pancreas, kidney, gall bladder, stomach, intestine, testis, ovary, 
uterus, rectum, nervous system, eye, gland, and connective tissue. Interstitial 
space of the tissues comprises the intercellular fluid, mucopolysaccharide matrix 
among the reticular fibers of organ tissues, elastic fibers in the walls of vessels or 
chambers, collagen fibers of fibrous tissues, or that same matrix within 
connective tissue ensheathing muscle cells or in the lacunae of bone. It is 
similarly the space occupied by the plasma of the circulation and the lymph fluid 
of the lymphatic channels. Delivery to the interstitial space of muscle tissue is 
preferred for the reasons discussed below. They may be conveniently delivered 
by injection into the tissues comprising these cells. They are preferably delivered 
to and expressed in persistent, non-dividing cells which are differentiated, 
although delivery and expression may be achieved in non-differentiated or less 
completely differentiated cells, such as, for example, stem cells of blood or skin 
fibroblasts. In vivo muscle cells are particularly competent in their ability to take 
up and express polynucleotides. 

For the naked polynucleotide injection, an effective dosage amount of 
DNA or RNA will be in the range of from about 0.05 ug/kg body weight to about 
50 mg/kg body weight. Preferably the dosage will be from about 0.005 mg/kg 
to about 20 mg/kg and more preferably from about 0.05 mg/kg to about 5 mg/kg. 
Of course, as the artisan of ordinary skill will appreciate, this dosage will vary 
according to the tissue site of injection. The appropriate and effective dosage of 
nucleic acid sequence can readily be determined by those of ordinary skill in the 
art and may depend on the condition being treated and the route of 
administration. The preferred route of administration is by the parenteral route of 
injection into the interstitial space of tissues. However, other parenteral routes 
may also be used, such as inhalation of an aerosol formulation particularly for 
delivery to lungs or bronchial tissues, throat or mucous membranes of the nose. 
In addition, naked polynucleotide constructs can be delivered to arteries during 
angioplasty by the catheter used in the procedure. 
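
The nested dosage ranges given above for naked polynucleotide injection can be expressed as a simple lookup. This is an illustrative sketch only: it covers just the "preferably" and "more preferably" ranges stated in mg/kg, the function names are hypothetical, and the 0.02 kg body weight in the usage line is an assumed figure for a mouse, not a value from the text.

```python
# Ranges from the text, in mg of nucleic acid per kg body weight
PREFERRED_MG_PER_KG = (0.005, 20.0)
MORE_PREFERRED_MG_PER_KG = (0.05, 5.0)


def classify_dose(dose_mg_per_kg: float) -> str:
    """Report which of the stated ranges a proposed dose falls into."""
    if dose_mg_per_kg <= 0:
        raise ValueError("dose must be positive")
    low, high = MORE_PREFERRED_MG_PER_KG
    if low <= dose_mg_per_kg <= high:
        return "more preferred"
    low, high = PREFERRED_MG_PER_KG
    if low <= dose_mg_per_kg <= high:
        return "preferred"
    return "outside preferred range"


def total_dose_mg(dose_mg_per_kg: float, weight_kg: float) -> float:
    """Absolute amount of nucleic acid for a subject of the given weight."""
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    return dose_mg_per_kg * weight_kg


# Assumed 0.02 kg (20 g) mouse dosed at 1 mg/kg
tier = classify_dose(1.0)
amount_mg = total_dose_mg(1.0, 0.02)
```

Because the text says the dosage will vary with the tissue site and route of administration, a classification like this is only a first screen and not a substitute for the dose-response work described below.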

The dose response effects of injected polynucleotide in muscle in vivo is 



either circular or linear, is either used as naked DNA or complexed with 
liposomes. The quadriceps muscles of mice are then injected with various 
amounts of the template DNA. 

Five to six week old female and male Balb/C mice are anesthetized by 
intraperitoneal injection with 0.3 ml of 2.5% Avertin. A 1.5 cm incision is made 
on the anterior thigh, and the quadriceps muscle is directly visualized. The 
template DNA is injected in 0.1 ml of carrier in a 1 cc syringe through a 27 gauge 
needle over one minute, approximately 0.5 cm from the distal insertion site of the 
muscle into the knee and about 0.2 cm deep. A suture is placed over the 
injection site for future localization, and the skin is closed with stainless steel 
clips. 

After an appropriate incubation time (e.g., 7 days) muscle extracts are prepared 
by excising the entire quadriceps. Every fifth 15 um cross-section of the individual 
quadriceps muscles is histochemically stained for protein expression. A time course for 
protein expression may be done in a similar fashion except that quadriceps from 
different mice are harvested at different times. Persistence of DNA in muscle following 
injection may be determined by Southern blot analysis after preparing total cellular DNA 
and HIRT supernatants from injected and control mice. The results of the above 
experimentation in mice can be used to extrapolate proper dosages and other treatment 
parameters in humans and other animals using naked DNA of the present invention. 

It will be clear that the invention may be practiced otherwise than as particularly 
described in the foregoing description and examples. Numerous modifications and 
variations of the present invention are possible in light of the above teachings and, 
therefore, are within the scope of the appended claims. 

The entire disclosure of each document cited (including patents, patent 
applications, journal articles, abstracts, laboratory manuals, books, or other 
disclosures) in the Background of the Invention, Detailed Description, and Examples is 
hereby incorporated herein by reference. 

Sequence Listing 

(1) GENERAL INFORMATION: 

(i) APPLICANT: Human Genome Sciences, Inc., et al. 
(ii) TITLE OF INVENTION: 207 Human Secreted Proteins 
(iii) NUMBER OF SEQUENCES: 800 

(iv) CORRESPONDENCE ADDRESS: 
    (A) ADDRESSEE: Human Genome Sciences, Inc. 
    (B) STREET: 9410 Key West Avenue 
    (C) CITY: Rockville 
    (D) STATE: Maryland 
    (E) COUNTRY: USA 
    (F) ZIP: 20850 

(v) COMPUTER READABLE FORM: 
    (A) MEDIUM TYPE: Diskette, 3.50 inch, 1.4Mb storage 
    (B) COMPUTER: HP Vectra 486/33 
    (C) OPERATING SYSTEM: MSDOS version 6.2 
    (D) SOFTWARE: ASCII Text 

(vi) CURRENT APPLICATION DATA: 
    (A) APPLICATION NUMBER: 
    (B) FILING DATE: 
    (C) CLASSIFICATION: 



WO 98/54963 



PCT/US98/11422 






(viii) ATTORNEY/AGENT INFORMATION: 
    (A) NAME: Kenley K. Hoover 
    (B) REGISTRATION NUMBER: 40,302 
    (C) REFERENCE/DOCKET NUMBER: PZ007PCT 

(ix) TELECOMMUNICATION INFORMATION: 
    (A) TELEPHONE: (301) 309-8504 
    (B) TELEFAX: (301) 309-8439 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 733 base pairs 
    (B) TYPE: nucleic acid 
    (C) STRANDEDNESS: double 
    (D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

GGGATCCGGA GCCCAAATCT TCTGACAAAA CTCACACATG CCCACCGTGC CCAGCACCTG 60 
AATTCGAGGG TGCACCGTCA GTCTTCCTCT TCCCCCCAAA ACCCAAGGAC ACCCTCATGA 120 
TCTCCCGGAC TCCTGAGGTC ACATGCGTGG TGGTGGACGT AAGCCACGAA GACCCTGAGG 180 
TCAAGTTCAA CTGGTACGTG GACGGCGTGG AGGTGCATAA TGCCAAGACA AAGCCGCGGG 240 
AGGAGCAGTA CAACAGCACG TACCGTGTGG TCAGCGTCCT CACCGTCCTG CACCAGGACT 300 
GGCTGAATGG CAAGGAGTAC AAGTGCAAGG TCTCCAACAA AGCCCTCCCA ACCCCCATCG 360 
AGAAAACCAT CTCCAAAGCC AAAGGGCAGC CCCGAGAACC ACAGGTGTAC ACCCTGCCCC 420 
CATCCCGGGA TGAGCTGACC AAGAACCAGG TCAGCCTGAC CTGCCTGGTC AAAGGCTTCT 480 
ATCCAAGCGA CATCGCCGTG GAGTGGGAGA GCAATGGGCA GCCGGAGAAC AACTACAAGA 540 
CCACGCCTCC CGTGCTGGAC TCCGACGGCT CCTTCTTCCT CTACAGCAAG CTCACCGTGG 600 
ACAAGAGCAG GTGGCAGCAG GGGAACGTCT TCTCATGCTC CGTGATGCAT GAGGCTCTGC 660 
ACAACCACTA CACGCAGAAG AGCCTCTCCC TGTCTCCGGG TAAATGAGTG CGACGGCCGC 720 
GACTCTAGAG GAT 733 
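
The listing format used here (blocks of ten bases followed by a running position counter at the end of each row) lends itself to a mechanical consistency check. The following is a minimal sketch and not part of the filed listing: the function name is hypothetical, and the rule that only IUPAC nucleotide codes are accepted is an assumption chosen so that stray digits or punctuation inside a block are flagged.

```python
from typing import Iterable

# IUPAC nucleotide codes, including the ambiguity codes (N, R, Y, W, K, ...)
IUPAC_CODES = set("ACGTURYKMSWBDHVN")


def check_listing(rows: Iterable[str], declared_length: int) -> bool:
    """Check sequence rows against their counters and the declared length.

    Each non-empty row must end in a counter equal to the number of
    bases seen so far, and every block must use IUPAC codes only.
    """
    running = 0
    for row in rows:
        tokens = row.split()
        if not tokens:
            continue  # blank rows between sequence rows are ignored
        *blocks, counter = tokens
        if not counter.isdigit():
            return False
        for block in blocks:
            if not set(block.upper()) <= IUPAC_CODES:
                return False
            running += len(block)
        if running != int(counter):
            return False
    return running == declared_length


# Two short entries from this listing, written with clean spacing
seq4_ok = check_listing(["GCGGCAAGCT TTTTGCAAAG CCTAGGC 27"], 27)
seq8_ok = check_listing(["GGGGACTTTC CC 12"], 12)
```

A row whose counter has been split by stray spaces, or a block containing a digit in place of a base, fails this check, which makes it a convenient screen for transcription damage before any sequence is used.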



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 5 amino acids 
    (B) TYPE: amino acid 
    (D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Trp Ser Xaa Trp Ser 
1               5 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 86 base pairs 
    (B) TYPE: nucleic acid 
    (C) STRANDEDNESS: double 
    (D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

GCGCCTCGAG ATTTCCCCGA AATCTAGATT TCCCCGAAAT GATTTCCCCG AAATGATTTC 60 
CCCGAAATAT CTGCCATCTC AATTAG 86 



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 27 base pairs 
    (B) TYPE: nucleic acid 
    (C) STRANDEDNESS: double 
    (D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

GCGGCAAGCT TTTTGCAAAG CCTAGGC 27 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 271 base pairs 
    (B) TYPE: nucleic acid 
    (C) STRANDEDNESS: double 
    (D) TOPOLOGY: linear 

GCCCCTAACT CCGCCCAGTT CCGCCCATTC TCCGCCCCAT GGCTGACTAA TTTTTTTTAT 180 
TTATGCAGAG GCCGAGGCCG CCTCGGCCTC TGAGCTATTC CAGAAGTAGT GAGGAGGCTT 240 
TTTTGGAGGC CTAGGCTTTT GCAAAAAGCT T 271 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 32 base pairs 
    (B) TYPE: nucleic acid 
    (C) STRANDEDNESS: double 
    (D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

GCGCTCGAGG GATGACAGCG ATAGAACCCC GG 32 



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 31 base pairs 
    (B) TYPE: nucleic acid 
    (C) STRANDEDNESS: double 
    (D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

GCGAAGCTTC GCGACTCCCC GGATCCGCCT C 31 



(2) INFORMATION FOR SEQ ID NO: 8: 



(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 12 base pairs 
    (B) TYPE: nucleic acid 
    (C) STRANDEDNESS: double 
    (D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

GGGGACTTTC CC 12 



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 73 base pairs 
    (B) TYPE: nucleic acid 
    (C) STRANDEDNESS: double 
    (D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

GCGGCCTCGA GGGGACTTTC CCGGGGACTT TCCGGGGACT TTCCGGGACT TTCCATCCTG 60 
CCATCTCAAT TAG 73 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 256 base pairs 
    (B) TYPE: nucleic acid 
    (C) STRANDEDNESS: double 
    (D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

CTCGAGGGGA CTTTCCCGGG GACTTTCCGG GGACTTTCCG GGACTTTCCA TCTGCCATCT 60 
CAATTAGTCA GCAACCATAG TCCCGCCCCT AACTCCGCCC ATCCCGCCCC TAACTCCGCC 120 
CAGTTCCGCC CATTCTCCGC CCCATGGCTG ACTAATTTTT TTTATTTATG CAGAGGCCGA 180 
GGCCGCCTCG GCCTCTGAGC TATTCCAGAA GTAGTGAGGA GGCTTTTTTG GAGGCCTAGG 240 
CTTTTGCAAA AAGCTT 256 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 2526 base pairs 
    (B) TYPE: nucleic acid 
    (C) STRANDEDNESS: double 
    (D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
G AC ACrGCT AT CCGAGAATCT GAGAGC'rax; CCCGGCAAI-r CCTCCACrrrA CCCTTGTGAC 
CTAAGTCCAG TCACACATTT C 2C AAAGTTT C T C TTTGTC A TAACCCTGGT CTGGCTGGTT 
50 TTGR2GRCTT GACAATGJGT CAG^GACTCC AG^CCAAGTC CAACAGAGAC CC 2AAACCCA ISO 
CCACACACCA GC AGS C AC AA COT' SAC 2 AOS AACAAAGAGG ACTTTTGTGG OG2CACAAGT 240 






GGCCA3GC2C CCAGCGACTC TTCTTGGCCT GATGTTTGTC C TC AX ~' AGGC A TOG C AC GTGG 540 

CCTGAGATGA TTCAGAACAA ATCATGCTAA CTTTGAATCC ATCCAGCCAC TT3CAAATGA 600 

5 

TAATCAGAAG TCAGCTTGTT CACTGTTAGA AAGAAACTAA CAAAAGAGAA CCCAGAGCAA 

TCTAGAATCT TTGAGTG2TT GG2TTTCCAA G3ATACTG2G GAGACTCTGG CCAA-3CTGAT 7 2 0 

10 gamcttgtga ARTGTCACTG G2AC2ATATG CAACAAGAAC CACCATTCAC TGAGTAGCTA 7 30 

ATCGGTTTOS GGCCT3GGAC ATTCCATCTG AGGTCCTTCC TGAACATGTC ACTCCACAGC 840 

AGAGGACO0G TTGCAGCTTA CCCAGAACCA CTCCTCCAGG ACAGCTGGAT GTTTTGCGTG 900 

15 

CAACACCTTG AGCACT'SACT GC T ATTGTTC AAAAAAAGCG TTTGCTGCAT TCGGAGGACT 96l) 

GCCCCGTGCC CTGAGGTGAC TTCCTAACTA TGTGGTTTCA TTAGCGAATT TATTTTTTGT 102'.) 

20 G2TGGGTGGA CATTTGTATT TTGTTAGGTT GCTGTTTAAG CTCAA3TTTG CTGTGCTCTC 10 8 0 

TGCAG2TACA AAAC:ATCTTG GGATATTTAA GAKTGGCTTT TATAAATAGC TTTATTCTGA 114) 

TATTAATCAG ATTCCCAACT TTACTGAGAA 1TAAGGACTG GGCTACTTTA AAGAAATGCA 12 00 

AATAGGAATT GAAGAACCAC TGCTGCAGGT GGTAGCCCTG GCTAGACTGA ATT AC A' .TAG 12 £0 

AAATCAG2CA GAAGGAAGCG TCCTTGGGAT C2CAGATCAC TCTTTTTTTT 1TTTTTTTTA 132 0 
30 AAAGX^GCAG CCG 1 LTTGATG GCTCATCTCT CTGAATAACA GTTACGTCTT CAT AT C 'GAT A 



1380 



CCAGATGCCT TCTTCATCAT GCCACTGAAG CCACTCACCA CCTT 7AAGAA CATGCCAACC 1440 

TCTGTCAGAT TCAGTTACCC A< 2AAAC AAGG AGGC ACGTT T GO:ACAAAGT GTTGTCGTC 2 1500 

AGGTCCAAGT GGAGTCTAC:A GAGTGCTTGA CCTCAACACA CTA^\TT02A GGTG3AGTGG 15-0 

ACCAAGAGCA GGG AAAGAC A CGGGAACTGA AAAACTCCAC A GG 3TTTG GA GAATAG.AAAT 162 0 

40 GAAAAGCCAC GTC AT ATAAC TCAAGAATAA ATGGTGTTTT GGAAATTTTA AAATTATCAT 1680 

CGAAGGTGGT GAAACTATTT CAGGCCCAAA TGAAAGGAAA TCGC C AGTTG GGGATGAAAT 1740 

CACAGAG2CT GTGTTTTATG ATATGGTTGG ATGTCCACTG ATG AAATTTT AAAGGAGTTT 180 0 

45 

CATTTTTAAA AGTGCGCATG ATTCTACATA TGAGAATTCT TTAGGCCAAG AAACTGTCCT I86 0 

TGGCTCAGAG GTCTTGGGAA TTAAA<OCAGA GAGAAGCCAT TCGTGATGCT TAGAACCAAG 192 0 

50 GATGGTCATG TACACAAAGA CCATCGAGAC GGCCATTCTT GTTTACAAAA CACTTACCAA 1980 

GAAAGCACTT TGTAGGGGAA CTTTAGTAAG TTCTTCTCAT TTCATTATGT TTCTTCCAAG 2 04 0 

GAAACAGGAG AGACTGAATT AATAATTCTC TCTTTCCTCT TAAGCACTTT TAAAATAATA 2100 

55 

AAGTACATCT TGAAATTTGG GGGGGCATCT CTGATTTAAA AAAAGAAAAA GGCTGCTTGA 2160 

TGTATGTTAT GCAGAGACAC TCTG2CTCTG GTGCCTGC AG AGC AAT AC C C AAGCCTCATT 2220 

TGGAAGGCTC AACATTTGGA ATPG2ACTTT AATTGATTAA TCCTCAATTC ATGTGCCCTT 2280 
ACGGGATGGT GGGTirTGGGA CCCCAATTCA TTCTTATCTG CCAAAGAATT ATCTAGAAGC 2340 

ACAT C AAAT A CCAGCACCCC ACCTGCACAA TO3GGGTGGA AAACTTTTGT ATCCCTAAGC 2400 

5 

ATATTATTTT ATAGTGTCTG CCATGCCATG TGGAAATACT TT ATTTTT AA CCTCAGGATT 246 0 

TAAATAAAGT AAACACTATG ACATTTAAAA AAAAAAAAAA AAAACTCGAG GGGGGCCCGG 2 520 

10 TAGCCA 2 52 6 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 1131 base pairs 
    (B) TYPE: nucleic acid 
    (C) STRANDEDNESS: double 
    (D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

25 CACTGCACCA GCTTTGTTAT CTGTAAAATG ATG AT AAT AC CAACACCTTC TTCTTGOOGT 60 

A'TTGAAGATG AGAGAACATG AT ATG TGT AA AGTGCCTTCC ACAATACCCA GAACAT A 3C A 12 0 

AACATGTAAT GAATGTAGTA ATAGTAATTA TTTTATTTTC TTTTGATTCA GTTGGGACTA 180 

30 

TGTTCAGCTG T AACAG AAT A CCCAAAATAA CTGTTTTAAA CAAATTAAAG TTTWGTTGTG 240 

AAGTTTTGTT ACGAATTCAG ACAATCCAGG GCTTTTATAG ATGCACCAG3 ATCAGCAGGT 300 

35 A2AAA G3C AT CTTTCCTGAT TTCTGC CAGT CTCAATGCAT GGGTTGCAAr CCAG.^TCCA 36 0 

R3AT3GCAGT TCCAGCCCTG GTTACGCCCA T ATT AGC AC A CAGAAAGAAA G AG AAAGO 3 A 42 0 

TGTGCCTCTT CACTTTAAT2 ATAGCTCCCA CTAGATGCAC CCACTACTT 2 TGCTGATACT 4 30 

40 

CCATTAGCTA ATGCTTGCTT ACATGGTCAC ACTTAGTTTC CAGAGAGACA TGTCTGGACA 540 

GTCAT^TGCT CAATTAATAT CCAAGTGTCC AATTACTGA3 AAAAAAA* 3 AA ACTA:3CAC0T 6 DO 

45 TOGCTTOGIT GCA7TCCTCT TAGCATAAGC CACATTCTTT TTATGAAGTT G r I 1 C CT C A ! 3 T^I 1 3 60 

A 3TTG3 ATGC CTCAGTTGTC CTTTCAWTTA GAAAWOCYCC TKGGAC AY' *"C TGAAV.'CTGAC 720 

TTCTTTTGTO ATCAGCACCA TCACTACCAC TGCCYTCTTC AAAGCCACCA CGTTCTGTCC 7 30 

50 

CCAGGATGGT TGCAACAACC ACCATAGGGA CTTTTTGCCT TCT ACTTC 1 2 A CACAATAGNC 84 0 

CAGAGTAAGC TTTTGAAAAT GTAGGTCAGA TCATGT2TCT CTCTTCCTCT TCAAAACCCT 90 3 



TATTTGCACT TAAAATAGAA AAAAAAAAAA AAAAAAAACT CGAGGGGGGC C 1131 



5 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 941 base pairs 
    (B) TYPE: nucleic acid 
    (C) STRANDEDNESS: double 
    (D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

15 

GGCACGAGTA GCATTTCATT TAATCTGCAG GTATATTCTC CCAACAGTTT ATTGTCATGT 60 

G ATG TC CT ''. ' A G3CAAGATTG TRA3GCAGAG AGGAGCTGTC CCAACCTACT ATACCACCGA 12 0 

20 GGCTGGAGAG ATC AT ATT* IT TG3TATTAAA CTGGAGT2TC TCCATCCTTC AC ATT GTTGA 130 

TGTCCTCTGT A-X.^AACGX AAAAGTCAGT GACAGAAC-AT GCCGCTAGCG GTTTGAGCCA 2-10 

G AG AAT* SAC A GCTCTGGTTT GGAGAAAAGG GCCGGATSGT G3CTCTAGAA AGCCCATCCT 3 00 

25 

TCTGCTCTTC TTTTTTCTCC C ! "J C TT AT ATT GT3CTTTCAT T2ATTCATTC ATTCATCAAA 360 

CATTTGTTGA GOAT CT ATT A TGT3TCAAGC TC TGTGCT AG CCTCTGGAAA ACCTGCCCTC 4 20 

30 ATGTAGCTCA CTGTGGAGTA '3GAGAAACAA TGACTACACT ATGATAAGCA CGGGTTGTCA 480 

GGGTCTCACA 3Ai3CAGTGGC C3CTCATC3A GACCGATGAG GTCAAAGAAG GCATCCAGGC 540 

GAGGATGGTG TCA: 1AC-CTAA CTGAAGAATG AGAGGGAGCT GCACCA3CAG GGGTTGGAAC 600 

35 

TGAAGGTGGC AGTGCCT3GA GTCTTGATTC CAGCAGAGGG AGAGCAGTCT GT G AAAAGGC 660 

ACCAAGGGTG GGAGAGGGCA GAGCACATGG AGG AACTT 3 A GGTAGTTCTG GATGG2SCTG 720 

40 GGGCAAAGCT A3A3A-3GTAA G AAGAATCT A CAAATGTT2C TC G AGTTAC A TG AACTT 3 C A 780 

TCCCAATAAA CCCATTGGAA ACGAAAAATT TAAGTCAGAA CTGCATTTAA GGCTGGT3CG 840 

AGTAGAATGA TTTTTACAAC GAATTGATCA CAACCAGTTA CAGATGTCTT TGTTCCTTCT 900 

45 

CCACTCCCAC T3CTTCACCT G ACTA 3CCTT TAAAAAAAAA A 941 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 843 base pairs 
    (B) TYPE: nucleic acid 
    (C) STRANDEDNESS: double 
    (D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

CNAGGGATAA CCCCAAAGNT GGGAAATAAA CC3TCAATTA AAGGGGGAAC CAAAAAGCTG 60 

GGAAGTTCCC CCCCGCGGTG GOIGCCNGNT CT A GG AAC T A GTGGAATCCC CCGGGGCTOC 12 C 

AGGGAATTCG GCA03GAGTG GGAATGTTGT TTGTATGATA CTATTTCCAC AAWATGCATT 130 

GAGACTTGGT KT3T3GCCTA '3GACATOGTC AACTCTTTYT AAATATTC CG TC AATTT C TT 240 

TAGTGCATAT TCrC3GAT3G >3GG3TGTGGG GACAGAGTTC TAAATATCCC CATTAGATTA 3 00 

AATCTCTTCA TTCT3TTG3T CA3ATCTTCT ATATCCTTAT TAATCTGTGA ATCTCTTCAA 3 60 

GAGAGGTGTT ATTAAAATCT 3TCACTGTAT GTGTCACTTT Q 3 C CTT AAAA TTCTGATGAT 42 0 

TTGCTTTATA AATGGTTATA A3CATTTTCC AGG A AG AAC A TTAAAGAAGT TTCCATTGGC 430 

A r ITATCCAGT TT C COT 3AAA ATACTG3TTT TTTTTATTTT GGCTNCTAAG CAGCTATGAA 34 0 

TCCAGTTTCT CAGAAGCCCT T3T2TCAAGG CATTTGTTTC CAG ATT AC GT TGTTAGCATC 500 

CAGACTATGG GC TATTTT AG A. AAAAG AAAA AAAG T AT C AA AAT CAT AT AG CTATGATTTT 550 

CCTGTGCTTG AAGGAG2CTT AAAGCTCATC TAGTCCAGC2 A GT ATT TG TT CATCCAAATT 720 

25 CTGCCAAGAA ATCTCTATTG TGAAGATATT CTTTACCATC TTTGGG A 3 AT TCTCATTATT 7 80 

AGAAACAAAT CCTAAGAAGA AATTCTGCCA TAJLACAACCC ATCCGTT3TT TAAAAAAAAA 3 40 

AAA 343 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 1018 base pairs 
    (B) TYPE: nucleic acid 
    (C) STRANDEDNESS: double 
    (D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

:iotaatttt TAATTTTGAT ATACCGTGCT TTGATTCTAA TTTTATTTTT TOACTTCTCT 60 

45 

GAAC;GTT ACA TAT AC AGAGT GCTTOAGGAA T3ATCATTTT GTTATTATTC AT 3C TT C TT A 1 13 

ACAATGTTGT TTTAC TCCAA GAA'GATAATT GCCAGAGAAA GAATACAGTG CA3GAAAGAA 130 

50 GARGCTGGAG 02AGTGGCGA AGARGGATTG AGAF.GACAGA CATTGTG:>GA AT 3 AAATCAT 24 0 

GAATAATCGT GTTTTTGAAT TOTCCAAAAA CTTCTACAAA CCATGAAATG TTGGAGTTTA 3 00 
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15 



30 



50 



AC-GAPGTCAA AAILLAAAAAC GTGTTTCTGA AACTCAAGGA ATTAATGACA CATAGGGAAG 
TTT"L r TGC'L»T ATTAAGCATA GAGTAGGAGA GGCAAGTCAA GAATAAAAAA AAAAAAAA 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 661 base pairs 
    (B) TYPE: nucleic acid 
    (C) STRANDEDNESS: double 
    (D) TOPOLOGY: linear 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 



60 0 



TGCtCTATCTA ATTTGGTOCC AAATAGTTAA TGTGCTTGAA TTTAAAAACA GCAAACATGT 
AGAAAGGTAA TT,\TAATTAT GAGG2 TAGTT CTTT AAGC TA Q2TTTTTTTC CCCTCTCAAA 600 

5 

CAGLATATTG GCTT>3GATGT C AG LA GGAG A AAG TGTTTTT TOCAATACAC ATAATGCATA 71: 0 

TAT3GTGCTG TTAG7AATCT ATAGAAAATA GATATTGCTC ATT AAGGT AA ATATTTTTGT 
10 TGAT3AATGA TGTG3AATG j TCTG3ACTTG TT3TGTGAAC A3GAAATTGC TCTGTA3GCT 



780 
84 0 



rPAAAGAG T3AGGCTGGT AAGATTAATT AAAGT AAAT A CTGTGACAAT 90" 



96 0 
1013 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

ttty^agaaat tagtgaatcc ccggi jtgcag ggaattcggc ACGAGGAGGA GC XT CGTC AGC 60 

TGGCAGGAGC GGAGGATVXVG AGCTGYTCCC ccqxtttgca cccccccagy tctgctggac 120 

35 ATAACATGGT T.AVLAGAGAG CCTG3GAGCT CXJGGAGCCTG TACCTGTGGA GTGCCGGCAC ISO 

O3CCTGGAG0 TGGGTGG^CC AAG<3AAGGGG CCTCTGAGCC CAG2ATGGAT GCCTGCCTAT 240 

GOCTG~GAGC GCCCTAC'OOC CCTCACACAC CACAACACTG GCCTMTCCGA (3CTGCTGGAG 3 00 

40 

CATG3AGTGT GTGA3GAG3T G 3AGAG AGTT C3GCGCTCAG AGAGGTACCA GACCATGAAG 3 60 

GTGCGCAGGG LAGGGCTC\3G ACCTACCCCA G3AATGTCCT GCCCTGGGAA TGACAACACA 420 

45 GTCCACACCA TGTACGSGGA GG2AAACAGG G3CA3CTGAC CCAGCCCAGG GGTCAGANGA 4 30 

GGTCTTGCCG A3GAAGTGGC AG3TAA<3CTG ATACCTGATA TGCACWAGKC AGCCARGYGG 540 

AGACAGGCAA G 3AAG AAG 3T TGTTTTGAGG AC AGAATTTT CTAGATCACT CAGCACCATC 600 

TGGCTTTTGG G32TTTTT3T TTTATTTTGT TTTTGAGACG GGGTCTCGCT CTGTCGCCCA 660 



661 




    (A) LENGTH: 553 base pairs 
    (B) TYPE: nucleic acid 
    (C) STRANDEDNESS: double 
    (D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

GGCACAGGGC T ATTTGC C C C TCTCTCCACA TGA<2AGAACT OCT C T AAGTT TCTTTGCTGC 

TCTTCTCAG2 TGT2AGA02G CTTGCTG2TT GTTTTC C" AC A CCA2CATGTC TATTCTTTQC 

TGTCCTTWAC TCTGCCT^TT TTTITCOT TGTATTTCTT CTOG2TCTTG TCCCTTTTCC 

CACGTGTCW2 AGCTTTC2TT TATTCCCACT TT 2 AG T CAG A GCAGTCCTGT G2TTCTGGTG 

CCGOCATACA ATACTTA 2TT GAGTTTCTTG GCTTTTCTTG ACTGTGCATC TCTTACTTCA 

ACATAGGAAT AGCCTGTCAT AG AATTTC T C CA STTCOAGG GCTCAAGAGG GAGAGTGCCA 

_ _ _ _ . , „ m * ^ » —r- t\ r\ -\ A r- r> 7\ ."TV^TTTr^T' d "> 0 

20 GAAAATTGAG ACT GTTTTC C CiVlvriu.A n'^n^^ ^w^—- ~ — *~ 

G'rGAGGGTTP G 2TGTGTCAT GCCTATAGGT TGTTTO3GTG CAAACCTATA GAATCCAGCC 480 
TOGAAM( ; A AAGFAACCAG AGAATANCAG CAT CAG AAC A ATGCTTGACA TCATTTCTCA 540 
ATCAAGCA "T CCA 



10 



15 



6 0 
120 
180 
24 0 
3 00 
360 



25 



553 



30 



40 



45 



50 



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 869 base pairs 
    (B) TYPE: nucleic acid 
    (C) STRANDEDNESS: double 
    (D) TOPOLOGY: linear 



60 

120 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
GGCACGAC:CT .3CCAACACTG AGGTCTTCGT OGCTTCTCAC ATCTA 3ATGT ATCCCTCTCA 

aatctat:.:t 2tatcca^2c accagattga ggtat:taaa atgtcaactt tccagttact 

r-TTCTTATA CTACCCCAVT OAAXT2TACAA C- AT AA AGTC C AAGCCCCTTC AT A2X ; ACAAA 130 

. , — - ^ -v- - - h 'ptt " ^ tv-t --.-t \( 'TTTAA ACTTT AAAAC 2 4 0 

CCACACCOTG ... . ^ - i. - - — ...... 

:CAGCAG:AC GAAAGTGTCT 2CTAPGCATO TTGCCATATG CGTTCTCTCC ATC ATGCATT 

TGCCTGA3CA AGATCTCTTG AOTiAACATC TTATTCTTT A AGACTCATTG TGGT3GTAGA 

CAGCCTTTAA TAACG3ATC0 TTGG2 2AGGC AC AG TG AC T C A 2 ACCTGT AA TC C CAGAACT 



300 
360 
CAAGGCTGCA G TG AAC CAT G ATCAGAACAT
CTTAG3TCAG AAAAAT GAAT AAATAAG2AT 
5 GO A' 2 AC AT C T GTGGTCCCTG CTACTTAGGA 
A3GTCAACAC T AC AG TG AGC TATGATTGTG 
AAAZCCTGCC AAAAAAAAAA AAAAAAA2T 




TGCANTCCAG CTTG>3TAAC AG AGTG AG AC 660 

AAAATTTTAA AAA 2TTAOCC AGGC AT GG T 3 7 2 0 

GGCTGAGGTG AGAGGATCCT TGAGCCCAGG 7FlO 

CCACTAAACT CCAACCTOGG TGAAA^AGCA 84 0 

86 : :> 



(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 959 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

ggcc;agccga GATCGTGCCA TTGCACTCCA GCCTGGGCAA CAAGAGTGAA ACTCTGTCTC 60 


AAAAAAAAAA /'-ATT AT AAT A CTATATC-CCA TAAAATGACA TTTCATATTT AAAGAGTTTT i::0 

TTAAAACTCT TGTATTCACA TCXTCATAA'IT TG AAAC ! C C T A TTTCACIGAA TGAGAATGGT 190 

30 ATCTGTTGTC 1 2TC ATTTTTT CATTTTTATC CTTAACAATT TCCACCA2AG CCAGTGCATA 24 0 

TAATGGCAAT GACACCCAGG GATGO^YIGA TAAGTTCCAT CRCN6KTT2AG TCAAGACGCA 3 90 

GACTTGATGT -^GCC-CAA^A ACAGTCAA7A ATGGAGTCTC CAAAATAAAG C T C TAT AGC iA 360 


AAGGTAAATA CCCGCTGCAC AAGAA/^CAC AGCATCTAGG TTCTAACCCC ATCTCTATGA 4 2 0 

AGA'GCTTGCT G3GAGAGTTT TGACATTV;AA CAATCTGTCT GATK3CCAAT TTTYTTCTTC 4*0 

40 TATAAAATGA TAATCTTK3A YTCAAA3ATC CAAAGTCAAT TCATGGTCTA AAACTTAATG 540 

ATTTTTTT AG GTTTTGKGAC ATTTC A 2TGT ACACTGTAGT AATTTATATC TTATTTTCCC 600 

ACTAATTTAG AAAAATATYT AAATGATCCT TAATTG3CAA TGGGTCCTAA GAATTTTGTT 660 


TTAAATCCCT GTTACCCAAA AGAGCCCTTT TTTGTATCTC GCAGTAGTTA CAAGGATCTT 72 0 

T CT AAATC TT .AAAAAAAAAA AAAAAAGAAA GAAAGAAAAG AAAAGAAAAA AAGTCAGCCG 7 80 

50 GGCGTGGTGG CTCATGCCTG TAATC02AG2 ACTTTG3GA 2 CAAGGTGGAC AGATCAC3AG B40 

GTCAGGAGAT GGAGAC CATC CCGGCCAACA TCSAGAAACC CTGTCTCTAC TAAAAAAAAA 900 

AAAAACTCGA GGGGGGCCCG GTACCCAATN C G2CGGCT AG TGGTCGTAAA AC AAT C AAA 959 




(2) INFORMATION FOR SEQ ID NO: 20:




(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1446 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

CGGGGCAG>5 CT3TGTGGCA CCGGGAU>GA GCGGGCCCAO CTGAGTCACT TT ATT GGGTT 

cagtcaaga: tttcttgctc cctgttttgt cttgtgt3gg atgatctcag atgcaggggc i::0 

TGGTTTT03G G' rTTTC'TTGG TTGT3CCAAG :3GC'"P:*GACA0 TGCTGGGG3G CTGGAAAGCC ISO 

15 ccTcocrrcc tgtccttctg to3C7Tc:at cccctcatgg gtgctgccat ccttcctgga 240 

gaga3ggag3 t:av.cc?t, tgtga3ccca giggcttcc: gcccactcac ccaggagctg 300 

GCTGGGCCA3 GAGGl^AGA GG3A3CAC7'G CTGCCCTCCT GGCCCTGCTC CTTGCGCAGT 36 0 


TAGGGGT-3GA CX^GCCTI'G CTTTC 7 07 AC TOTT7TOGAO C;3AAGGGGAA GGAG7GO3T0 420 

TTGAGGCTOG A3CGAOG7TG ^;,GGTGCTGG GTGGAGAGAT GAGATTTAGG GGGTGCCTCA 4 30 

25 TGG7^GT03GC ACOCC^GGGG TGAAATFAGA AA3G7CCAGA ACGTGCAGGT CTGCGGAGGG 540 

GAAGTGTCCT GAGTGAAGGA GGGGACOCGC ATCCTGGGGG ATGCTGGGAG TGAGTGAGTG 600 

AGATGGCTGA GT3AC-GGTTA TCCGGAXOT GAGGTTTTAT GGGCCTGTGT ATCCCCTTCT 660 


CGC3GCGCOA GCGT07CTCC CTCOT3GC'7G CCTGGCCCAC AGGTCTGCCT CTG3TCCCTG 720 

TCC7TCT7GT OOTT07.GGAT ^TACXTGTCAG C7AAGO*3GTOT AATGGGGCTG GGTTGTGTCT 780 

35 t:tacao:;cc acccg- ;agot 7CTCactogt t<3c:ctc,:3GGA (jccggagggg gctcctgagg 840 

CGTACAG3TT G7GT03GCG: TGCCTGAG3G TCTGGGGTCA 0:3CTTT7,GCT CTGCTGCCTC 90 0 

TCAGTCA7GA AGTGAGCTC2 CTCTOAAAAT CGAGTCCCTT GTTTGGATGT CCTTGTGAGT 960 


CACTCTGQ3C CTOG7TGTOG TCCCTCCTCA GCTTCTTGTT G2TGGGACAA GGGTCAAGCC 102 0 

Ac^ATA.xArc gao:v:otq3G atccc7cacg coa:*;a:o-7 cag.^oog-t ccoctoctx- 1080 

45 tttgc^^gk; ^ :acg:. ;x:a :;a aat v ? a:~cg rrrroGGTOG ccga^twg go ^ \- :to - • 1140 

cgtotggt^tG o37Ctgg7ac toaotgacco ccagctgocc cagccctggt ctotggotca 1260 

rCTOTTGCCT OTTGTGCCGA AGAT7T3CG7 CTGTAGTGCC TTTT3AOGGG TTCGCATCAT 1320 

2GCTCG 7TGA TATTGTATTG AAAATATTAT GCACAGTOTT CATGCTTCTA GTAA70AATA 13 30 
(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1471 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

CAAAAAATAA TAATGATAAT T T A AAAT AAA. TAAGTAACTA AT AAAAAG AT TTTATATCCC 6 0 

AGTCTTATGA TGTT 1 3GTTOG CAAGGCTAGA TAAAAAGATG TTAGAATGAA AG AAC AT ATT 120 


TTT AGTGAT A TG T AAATO AA GGA'TTCTACA ATAGTCATAT ATTTTTATAT GAATGAATGT 180 

TGGGTTGC5GC TGGA3AOSTA TGTGTGTGTA AAT AT AAA! *G TCTCACATTC AGAGTATA3C 24 0 

20 TCTGAAATAA TGOAACTCAT GTCTACAATT CAACATi 5* 2 AT CTGTATAGTT ACATCTCATG 30(1 

TAAATATACA CAGACATATT TTGCA3CCAG T AATT G A- 2 AG TTAATGTCCA AAA C AGGTG A 3 60 

TTGATAGGTA ACAGAAATTA CAT AAC C AC C A\TTTTA2CC AAC AG AAAG A C T AGAAGG AC 42 0 


T AAAAGCAG T TGAATGTATG OTACTGACAT TGTCATAAGC AGTCTGATAA C C AGTTT ATT 480 

GAAACGTGTG C ATT AAC AG A GAATTTAATT TTAAACCCAT AATTTCTCCT ATCCATTAAA 54 0 

30 ATATTATAAT TGTTAGTAGT ATGAAACCAA CAGGAAATGT TTTTT AAT CA TTTAGTGAGG 600 

TGATT:ATTT GTTTCATOOG CAAACACTAT CCA 3GAAAAG CCTTGCTTGC CTGTTTCCCA 660 

AAGAGCT2TA AGAAATAGAA TCAAGTGTAA AATGGTTCAG ACCATTCAGG ATTTCTTGTC 72) 


ACTCTTCTCA ACCCCGATCT TCCTGTTATT ACTGATGTTT lAAACCCTGT CATTAGCCCC 730 

GSCCTSGTTA AAGCCCCTCA GAGTCACCTC TCATTCATAG CAArAGAATT CAACCC 2AAG 34 3 

40 TGGTTGATGG TGTCCCCA3C ACAGCOGAGA GACCTGATCT CTiSGATTCAG T< 3CTTTT AGS 900 

TCTTCGAGTT TACCCTAAGA TACCTTCGGG CAATATTTTT AACCAACCCA AAAG2 TCTTC 960 

AGGTCATTT^ TGAAGAGGAC AAGGTG AAT C TTGGCTTGGA ACACCATTTT T3G3CTCTTG 102 0 


CTACTGAATG AATCAGAAAG GAATTTTTTC TGAAGAGCAT T AG AAAGT AA A3GAGATGTT 108 0 

AAAATAAGTT CTTGAAGTAT GTTTTATATT TATCTAAAAC ACTGATTTTA AAA 3TTT ACA 114 0 

50 TTCAAATGTG TATTCAAAAG AAGTACTGAT TTGTAATTAT TATAGTTTGT GTGTATCATC 1200 

CCCTTTTAAC CGTGCCTAAC AACTGTACTT AAATTTTGTT TTCCTAGTGT AAC AAATGTT 1260 

TCCCATAAGA TTTTCTAGAG CCAAATAATG GGAGTGAAAA ATTCCTTAAG TGTTATATAA 13 20 


GAAAATATAT TAGAAAATCA GCTTTGGATT ATACGATTTC T AAAAT AT AC T AAT AC AG AA 13 80 

TCCTCAGTAA TATGTTTTGA ATTGGATTTT TTCTCAGAAC TGTTACATAA TAAATAATAC 1440 

60 ATCAACCAGA AAAAAAAAAA AAAAAAATTN C 1471 
(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1402 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

15 AGGGACGTCT TGOCTGACJGA C SATO rCCATT TCTGTCCTGG RTTACCCTCA CTGCGTGGTG 

CATGASCTGC CAGAGGTGAC QOO^GAGAGT TTGCAAGCAG GTGACAGTAA CCAATTTTGC 120 

TGGAGGAACC TCTTTTCTTG TAT 7AATCTG CTTC'GGATCT T3AACAAGCT >3ACAAAGTGG 180 


aagcattcaa g:;acaatgat gct:;gtsgtg ttcaagtcag cscccatctt gaagcgggcc 24 o 

CTAAAGGTGA AACAAGCCAT GATGGAGCTC TATGTSCTGA AGCTGCTCAA -SGTACAGAC^ 300 

25 AAATACTT 1 3G GOOSGCAGTG ^OCGAAAGAGC AACATGAAGA CCATGTCTGC CATCTACCAG 360 

AAGGTGCOGC ATOSGCTGAA GGAC3ACTGG GC ATACG3C A ATGATCTTGA TGCCCGGCCT 42 0 

TGGGACTTCC ACySCAGAOGA GTGTCOCCTT CGTGCCAACA TTGAAC3CTT CAAOSCCCGG 490 


OGCTATGACC GOSCCGArAG CAACCCTGAC TTCCTGCCAG TGGACAACTG CCTGCA3AGT 54 0 

GTCCTOGGC3 AA:GGGT0GA CCTCCCTGAG G AC TTTC AG A TGAACTATGA 2CTCTGGTTA 600 

35 GAAAGGGAOG TCTTCTC2AA 2CCCATTTCC TQ3GAAGAGC 1X3CT SCAGTG ASOSTGTTGG 60m 

TTAG3GGACT GAAAT>3AGA 0AAAAGATGA TCTGAAGGTA CCTGTGG3AC T3TCCTAGTT 720 

CATTGCTGCA GTGCTCCCAT 2CC 2CACCAG GTQ3CAGCAC AGCC2CA2TG TGTCTTC03C 7 30 


AGTCTG TC C T -G2GCTTGGG T GAGCCCACCT TGACCTCCCC TTGGTTCCCA GGGTCCTOCT 84 0 

CCGAAGCAOT CATCT2TOCC TGAGATCCAT TCTTCCTTTA riTTCCCCCAM CCTCCTCT 2T 900 

45 TGGATATGGT TO^TTTTGO~C TCArCTSACA ATCAGCCCAA GC.YTGGGAAA GOT': VS AAT SG 9i*0 

ATACAAATCC TTGCCTTT<1T CACWTTTGCAA AGG AOS AG AG TTTAGGATTA GGGCCAGGGC 10MO 
CAGAAAGTCG GTATCTTGGT TGTGCTCTGG GSTGG'OGGTG GGGTGTTTCT GATGTTATTC 
CAGCCTCCTG CTACATTATA TOCASAAGTA ATTGCGGAGG CTCCTTCAGC TGCCTCAGCA 



114 0 



\TAAAAAAAA AAAAAAAAAA OT 14 0:;




(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1047 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:


GGCACAGGGG ACrACAGGCA CCCACGACCA TACCCAGCTA ATTTTTGTAT TrTTTTGTAG 60 

AGATGGGGTT TCACGATGTC GrCCAOGOLG GTCTTGAACT CCTGGGCTTG AGCGATCTTC 120 

20 CCATCTTTCC ATCTT<JGCOT CCTAAAGTGC TGGGACTGCA * iOSATG AGCC ACCATGCCCA 18 C 

GCCAAGATTC TTATTGATTA 2 | 2ATGTTOOT TCAAGAAGCC AAGCCAGTTT CCAATATTCC 24 0 

CCATTTGCTG GAGTCTTO ?T ACTTTSGGTA GAAGCAACTG GTAAATTGTT AATTGGAACA 3 00 


NTTGGTGGTG TA3ATAACCA CGTATGGC 2 A AA2 2TAGAG2 ATCTAGGCTC AOAATTACTA 3 60 

TCCTGACTT S ATAACAAGTG TTCTGATATT AAC CTG AAAA TGGGAATAAT GCCAAATCTG 42 0 

30 TGTAACTTAA CATCTATATA C AC AG TGGGG AGAACTGAAG TTATTAAACC T3GAATCTCT 480 

GTGATCAAGG CTAACA3TAG TTATCTAAGA AGCAAAGGAC CTACAATTCT TAGACTTGGA 5 40 

GTCATATTCT TTAAGGAC jT GTTCTGAAAC TATATCAAGC ATCTGGTTTC CACGTATTTC 60 0 


TCCCTCAGAA ATTATGAAGT ACAAGTAAAA ATGAAG3TAC AGGGTAAGAC ACATGCTGCT 6 6 0 

TTCTTGCTCT TGAGTGGAGA CAG7TTTCCA GCCATCTTAA 02CCTTOACA CAAAACAATT 72 0 

40 TGTGTTTT AT AGC AAAT AAG TGACTCAACA TAATTTCAAT ATGATGTTTA TCCACCAGTA 7 90 

CTTTCCTTTC AGCTTCTAGT CCCATAARTG GTTTGTGAAG TCATCGGTTA CATTAGCCAA 840 

GATAGGCCTA GACTTGAAGT C TAGAATGTT TTTCCCACTA TATGCCAAAG TAGAATGTGG 900 


GTATCTCAGG GTCATTTTTG TTGTTCAATT TCCCACCTGT ACAGTTGTTA TGATTCACTT 960 

TC CTT ATGTG TCTAATAAAT CTTGTTCCAT GAAATGATCA AAAAAAAAAA AAAAAAAACT 102 0 

50 CGAGGGGGGG CCCGGTACCC AAATCGC 1047 



(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 990 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

TTG3AAAGGG TCTAGCTCTT TOY C ATTC AC CAACTATATT AGAAGCACTT GAGGGAAATT 60 

TACCACTCCA AATCCAAAGC AATGAACAGT CTTTTCTGGA TGATITTATT GCCTGTGTCC 120 

C AG 3 ATGAAG TGGTGGAAG3 CTT3CAA3GT G3CTTCAGCC AG ATTC AT AT GCGGATCCTC ISO 

AG AAAA' OA' PC TTTGATCCTG GAATAAGGAT GATATTCGTT GTGGTTG3CC TACCACCATA 240 

ACT 3TTGAAA CAAAAGACOA GTATGGG3AT GTGGTACATG TTCCCAATAT GAAGGTAATT 300 

AT AAC TGX3 AT TAAATTAGJA GACATCTATA TACTOGCTGC AATG ACT' ? AT AAAATT TT AG 3 fi 0 

AAATGC3AAG TGCTGAGRGT CCATTTGTTC TACCCTCTTT ATATAAAGGG TG ATGCTGAA 4 2 0 

AGTTTGTTTA AATGACTT3T TTATATTAAT TAGTCCCCAA GTGTCCAAGT TACACCTGTT 4 80 

ITTTTTGT OA G TTTG TT 2 T T T A 2 ATTTTGC T AC CTG TT AC GCGG A' 2TC AA AGG AG 3G AT A 5 4 C > 

AGAAAGTATC CATCTAAAGA 3T-3CTAGACA CAT A C AGTG A AGCC02TCAA TATGTATTGA 600 

TTGAATAAAT GCATGAAAGA ATACATTTTT AAATTTTGTG TATAGTTTTG AAAGACTCAA 660 

GTACGTTCTG TCTTTGGTAT TAT TG AA AC C AC ATTTT AAA AATAACACTC ATT AA 3TTAG 720 

AAATATATGA GTTTAGATTG T AAAAG AATG AGGAATTGAA ATAG TTGT AT AC C AT ATTG A 7 80 

TGAATATAG A GTTTTTAGGA TACCTCTTAC CTGAAATATT AATAATAATG TTTNCAGAGC 84 0 

ATATTATACA TAATTATTTG TGATTTAATC TGTTAATATG AATATCTCAT TTAAAACTTT 900 

TATTTCTGAA AAAATT AT AT TGAATAAAAT TTTATATAGG CAGTCCCCAG CC CTTTCCTC 960 

CTTCAAAGTT GTCTTATAGA GTGATT3GTT 990 



(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11' : 'i base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
T AATCGCT AC TATAGGGAAA GCTGGTCGCT GCAGGTACCG GTCCGGAAIT CCGGGTCGAC 60 
CTCCTC3GCC CCTGAGCTCC AG3CCGT3CG CATGTTTGCT GAOTACCTCG C C C AC GAG AG 3 60

TCGGAG3GAC A3CATCGT3G CCGAGCTGGA CCGA3AGATG AGCA.G 5AGCK T33ACGTOAC 42 0 


CAACACCACC TTCCTGCTCA TG3CCGCCTC CATCTATCTC CA.":GA 2'CAGA A2C03GATGC 43 0 

C.5CCCTGCGT GCG2T3CAC3 AGG33GACAG '7CTG3AGTGC ACAGCCAT3A CAGT02AGAT 54 0 

10 CCTGCTGAA3 CT-3GACCGCC TGGACCTCGC CCGGAAGGAG CT3AAGAGAA TG2AG3ACCT 600 

GGACGA3GAT GCCAC3CTCA CCCAGCTCGC CACTGCTTGG GTCA3CCT3G CCAC 33GTGG b* : -0 

TGAGAAGCTG CAGGAT3CCT ACTACATCTT CCAG3A3ATG '3CTGACAAGT GCTC3CCCAC 73 0 


C2T3CTGCTG CTC AAT '303C AG3CGGCCTG CCACATGGCC CA 3GGCCGCT & 33 A- 3* 3CCGC 7 'HO 
T3A3GGCCT3 CTGCAGGA 3G CCCTAGACAA G3ATAGT ( 3G-7 TA7CCRGAGA C3CT3GTCAA 8 4 0 

20 CCTCATCGTC CTGTCCCA3C ACCTKGGCAA -3CCCCCTGAG GTGACAAACC GATACCTGTC 900 
CCA3CT3AAG GATGCCCACA GGTCCCATCC CTTCATCAAG GAGTACCAGG CCAAGGAGAA 96 0 

CGACTTTGAC AG3CTGGTGC TACAGTAC3C TCCCAGCGCT GAG3CTGGCC CAG A GCTG TC 102 0 


AGGACCATGA AGCCAGGACA GAGGCCAG3A GCCAGCCCTG CA3CCCTCCC CACCCGGCAT 1030 

CCACCT3CAT CCCTCTG3GG CAGGAGCCCA CCCCCA3CAC CCCCATCTGT TAATAAATAT 1140 

30 CTCAACTCCA RG3TGTTCCA C C TGAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1200 



AAAAAAAA 




(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1922 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

GTGCTGCGCT ACTGAGCAGC G2CATGGAGG AC TCTGAAGC ACTGGGCTTC 1 3AACAC ATGG 60 

GCTTC3ATCC CCG3CTCCTT CAGGCTGTCA CCGATCTGGG CTGGTCGCGA CCTACGCTGA 120 


TCCA3GAGAA G3CCATCCCA CTGGCCCTAG AA 3GG AAGG A CCTCCTG3CT CG303CCGCA 130 

CGGGCTCCGG GAAGACGGCC GCTTATGCTA TTCCGATGCT '3CAGCTGTTG CT07ATAG3A 2 40 

55 AGGCGACAGG TCCG3TGGTA GAACAGGCAG TGAGA3GCCT TGTTCTTGTT CCTACCAAGG 300 

AGCT03CA CG GCAA3C AC AG TCCATGATTC AGCAGCTGGC TACCTACTGT GCTCGGGATG 360.- - 

TCCGAGTGGC C AATGTCTC A GCTGCTGAAG ACTCAGTCTC TCAGAGAGCT GTGCTGATGG 42 0 

AGAAGCCAGA TGTGGTAGTA GGGACCCCAT CTCGCATATT AAGCCACTTG CAGCAAGACA 4 3 0

GCCTGAAACT TCGTGACTCC C TGGAGCT TT TGGTCGTGGA CGAAGCTOAC CTTCTTTTTT 540 

CCTTT03CTT TGAAGAAGAG C TC AAG AG T i 7 TCCTCTGTCA CTTGGCCCGG ATTT AC ( GAOL; 60 0 

CTTTTCTCAT GTCAOCTACT TTTAACGAG5 ACGTACAAOC ACTCAAGGA7, CTGATATTAG 600 

ATAACC 0 GGT TACG 7TTAAG TTACAGGAGT CCCTAGCTOGC TCGCtCCAGAG CAGTTACAG7 '7 2 0 

AGTTTCAGGT GGTGTGTGAG ACTGAGGAAG ACAAATTC 7T CCTG7TGTAP GCCCTG7T'2A 780 
AOCTGTCATT GATTCGGGGC AAGTCTCTGC TCTTTGTC -VA CACTCTAGAA CGGAGTTACC 840 
GGCTACG7CT GTTC TP GGAA CAGTTCAGCA TCCCCACCTG TGTGOTGAAT GGAGAGCTT7 90' > 

caotgcgttg gaggtgccac atgatctca7 at/ttcaagga aggcttgtag gactgtgto\ c> :« 

TAGCAACTGA TGCTGAAGT7 CTG3GGGCC 2 CAGTCAAG^GG CAAG7G FCGG GGCCGAGGG7 10'J 

CMAGVAGGGGA CAAGGCCTCT GATCCG2AAG CACGTGTOGC COGGOGOATA GACTTCGA7C 10H-» 

ATGTGT 7TGC TGTOCTCAAC TTTGATCTTC CCCCAACCCC TGACrGCOTAC AT 7CATCGAG 114" 

CTGGCAGGAC AGCACGCGCT AACAA7 7 7 AG G7ATAGTCTT AAC 3TTTGTG 2TTCCCAC7G 1200 

AG2AGTTCCA CTTAGGCAAG ATT GAG 7, AG 7 TT' 7T 2 AGT GG A GAG AAC AGG GGGGCGATTG 1261 

TGCTCC GGTA CCAGTTCCGG ATGGAGGAGA TCGAGGG7TT CCGCrAT7GC TG7AGGGATG 1.12'.) 

CCAT'GCGGTC AGTGACTAAG CAGGGGATTG GGGAG3GAAG ATTGAAGGAG AT- .7 AA GGAA G 1380 

AG2TTCTGCA TTCTGAGAAG CTTAAGACAT ACTTTGAAGA CAACCCTAGG GACCTCCAG7 1440 

T G7T-GCGG7A TGACCTACCT TTGCAGCCCG CAGTGGTGAA GCCCCACCTG G0CCATGTT7 150') 

CTGACTACCT GCTTCCTCCT GCTGTGGGTG GCCTGGTFGG CCCTCACAAG AAGCG7AAGA 15 50 

AG2TGTCTTC CTCTTGTAGG AAGGCCAAGA G AGC AAAG T G C C AG AAC C C A CT 3CGCAGCT 15 2 J 

TCAA'GCACAA AGG AAAG AAA TTCAGACCGA CAC-CCAAOGC CTGCTGAGGT TGTTGGG7CT 156 3 

CT2TGGAGCT GAGCACATTG TGGAGCACAG \ GTTA 7AC 7C TT7GTGGACA G'CGAGOCTG 1740 

TGGTGCTTAG T "A "A V 7T GAACAGACAG 7T GTOOGOCC GG7AGTG77G , •„ . • v ?:T; ; 1 0 

('*T:~ r 'T rr, GO'~ *» ^TT' .\ CG«^ A 7 * CTTGC 77 ^TTGA.GA A. CAGAATAAAA A^ TTTVGC7G 13 6 3 

CCCCAAAAAA AAAAAAAAAA AAAAAAACTC GAGGGGGGGC C7GTAC 7CAA TTCGCCCTAT 197 0 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:


TGGTGC0 3A3 AGCOOGCTGA GCCCOAGG' :< 3 SAGGGTG>3G GGGGAGCCTG CrGGGA'.*OGGC 60 

CGGGAC3TC3 ACG3GGGTCT CTGA3CTCG3 ACACCAGOG: COT 3TGCTAT ga;tctgtca :.:(■ 

10 AGTACACGCT G3T3GTAGAT GAGCATGCAC AGCTGGA' 3CT G3TGA3CCTG C<3CCGTQ3TT 130 

GG7AGACTAG AGTCACGA-3A GTGACTCTy3 CACCGTCTAT GACAACTGTG C ITCCOTCTG 24c 

ctcgcc^tat gagiggg::a tcgca ;ag3a A3\atgag3ag g~:c30G3qg3 c:3ag3ccc3 30c 


T3GCTGGCTC TGC 3AG3AAC TCCAC 7GCTG ATGAACCCGA GG 7C ;ATTTC TC 3AAGAAAT 36 ") 

T 3GTGAACGT YTTCAT 3AGT GGCG GGTCG 3 O3TG0TC3A3 TGCTGAGTCC TTG irGJ^TGT 423 

20 TZTCCTGCAT CATOAA 3G:G GAG3AG3A3>3 A(.iCA3AC:GA 33GG3GCATA IT 3 A 3- j ITT3 4f*0 

TG3CTCGACA GGAAGAOGAA CTTGAG3T3CJ AAGT3GATGA C3CT3T3CTA GrG:GV3CTCC 5 4 .) 

AG3CTOAAGA CTACTGGTAC GAGGCGTAGA A' 3 AT 3CG3 A C TGGTG03CGG CCTOTCTTTG 60 3 


CT3G ETATI "A C GG AT : 3 A 3 GTC AGO AAOG A 3CC 3GAG3 A G AT GG3AGCC C T 34-: : 0 3 AAAA 6 6 0 

ACAGTCACTG GGT3GA3 CAG TTCCGGG7CA A 3TTGCTGGG CTGA3T3CAG GTTC3CTATC 720 

30 ACAAGGGGAA TGAGOTCCTC TGTGGTGCTA TGCAAAAGAT ICGCACCA3G GGCOG3CTCA 780 

CCGTG3ACTT TAACCCGCCC TCCACCTGTG T3CTGGAGAT ^ AGCGT3CGG GGTCH GAAGA 840 

TAC/vr;TCAA GGO"GAT3AC TCCCA.GGAOG CGAAGGC^GAA TAAATGTAGC CACT^TTTCC 9C0 


AC iTT AAAAAA CATOTOITTG TGCGGATATC ATCCAAAGAA CAAj3AA<GTAG 7 . 7' i XX3TTOA 3 6 3 

TCACGAAGCA G0:33GCGGA3 CAC03CT77'G CCI'CCCAOOT GTT3GTGTGT GAACAG7X:CA 1020 

40 CCAAAGCCGT '3GGA3AG7GG GTOTGCAGAG CATTCCAGCA GTTCTACAAG < :AOTTTGTGG 108 0 

AGTACACCTG GC ' 3CAG AC j AA GATATCTAGG TGGAGTAGGT GTGCAGCCCC GCCCTCTGCG 114 0 

rp CCCC i 7A&2C GT^A'^JCAG TGCGAGGAOA GOTGGCTGGT GA3AGGATGT GG33\GTGCTT 1200 


GAG3AG3G-3C AC3730"2ACG GGGAGAGGA3 AAGGAAGTGG 0030*33X2^303 GA:^G3TAG3G 12£0 

GA3GGT3G3G GAATG03GAG AGG-AAAT-JC AGTTTATTGT AATATATGGG ATTAOATTGA 13 2 0 

50 TCTAT3GA 3G GCAGAGT-yjG GTO20TG3GG ATTGGGAGGG ACAG3GCTTG G3GAGGAG3T 1380 

GTGTGGCAGA GAAG3AT3TC CGTTGGA3GA G3A3ACG3CC CTGOOCCATG CTGGGCGTTA 1440 

G^TCCCCTGC GAGGG3TPGG GCGCTGT3GC TGGTGGCTTG ATGAAGC 2GG T3TCCT3CCT 1500 


TGATGAAGCC TGTGCCACCT GCAAGT032C GCG3TGCCCC TGCCCCAACC CCGACCGAAG 1560 

A-SCCCTGAGC TCAGGGTGAG CCCAGCCACC TGG'3AAGGAC TTTGCAGTGA G3AAATGGGA 1620 

60 ACACGTGGAG GTGAAGTCCC TGTTG TCAGG T3CGTCATCT GCG33GCTTC TG3GTGGCTC 1680 
CTGCCACTGA CCTCACCGGC AT:>2T007 2T GTGGCAGCCC TAGGACCTCA ggcgg:,gagg ; • 4 ..

AGGAvGCTGCC OCAAG3CCCT GTCCCAG2AG AA.GA3GGAG2, CTTCCTGACT GACACAGGCC 1 B C' t • 


A&:CCCATCT TOjTCCTGTC AC:CTG2C2C CAACTATTAA AGTOCCATTT CCTGTCAAAA LSfu 

AAAAAAAAAA AVYATO 3G 3G G3G2CCCG3A ANCCAATTTC CCCCAAAAAG GOGCGTTATA 192':.' 
10 AAVVTTCOCN G3CNGTGTTT TTAAAAATTC G 



(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3989 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

25 og3aca20oc G2ag3gna::: tatg3o:gca tataggttgt aatgaaactg tagtctcagt eo 

TG3AAG2CTA GACATGA-WT =3G3TCAGT3A (XTAAGGCTCT ATTCCTAGTC T02AGOIATG 1/it 

CCTGTGjAAC CT 2 ARCCCRC TCTCAGCACA TTG3AC2CAG G2AGATGYAA AAAATTCACA IS'' 1 


GAACTATGAT TTGGACTOAA G2GTTTGTAG ATTTCCTCCT TCATTCTAAT TTCAGTGTCT 2 4 0 

AAAATT 2TT ! 2 CATCCRTGAA : 2AGOTGCG2 ATTTGATGA 1 2 A2AG2GCYGA ATACT02AGT 3 0 o t 

35 TTTC2TC2TA GAAATCAT:;T G2G3CATTTT CTTTGAACTj ATG3GAACAA TAAGGCATAA 3(,it 

CTGTTT ^ : A _ AAVJTTG } 3 A T ^ AR T< 3 ATTT TGG 2 ATAA RG ATCT AC CA' 3 A ATGGG2 ATAT 4 2') 

TT2A2C 2TT2 2TT2IGAGAT GO AAAC Z AAA GAATATCAT Z A2CAG:TTTC AG2CCTCCTG 48 u 


AAGTATATCI CTCACATTGT CCTGTTCTCA TGCTGAGGAG C2TGAGAT2C CTGTGTGG2G 54.1 

atta^acagt ggactgttat 203totargt 2;ayitg:;ctt attttgtctg tcgct: TCTG SO" 1 

45 AATGrArr ;c ag3aaytaaa aagraceaa^ aa ;ag ;a-\ga aga 7 : \ac-gr craoc^rgcc 66 1 

atagatgtta ttcaact-tct tc:agttgt2 ttgaacagcc tga rtcctgc cagccctatg 739 


G AAG TT C C TT TTATGCATTG GAG3AAAAAC atgttcgctt ttctcttgac GTCGGACAAA 84 ) 

TTGAAA AG AA GG 2GAAGG 2G AAG AAAA. J AA GO GGAAG AAG AT '2 AGG-. CAAG GAAAC "i / AG A" . 9 0 ) 



GGAGCAACAG CATGTTGGCT TGGCTGTTGA C ATGG ATG AA ATTGAAAAGT ACCAAGAAGT 11-10

ggaagaagac caawoccat catgccccag gctcacjcagg gagotgctgg ATGAGAAAGA I20U 

5 GCCTGAAGTC TTG« IAGGACT CACTGGATAG ATGTTATTOG ACTGCTTC P\G GTTATCTT3A 1260 

actgcctcac ttasgcgagc cotacagcag tgckgtttac tcattg3A.gg amcaktacit i:io 

T30CTTKKCT CTTGACGTGG ASAAATTGAA AAGAAG3GGA AC5G-3GAAF.AA AA 3AAGG3GA 13f<0 


AGAAGATGAA AGAAGGAAAG AAG AAGO 3G A AG AAAA 3AA 1 j GGGAAGAAGA TCAAAACCCA 14 V) 

C3ATGCCCCA GGGTOAGCAG GGAGGTG3TG GATGAGAAAG GGC2TGAAGT CTTGCAGGAC 1000 

15 T C AC TOG AT A GATGTTATTC AACTCCTTCA GGTTGT3TTG AAG TG ACT 3 A CTCATGCCAG 1060 

C C C T AC AG AA GTOCCTTTTA YHTATTG3AG CAACAGYGTG TT GGCTTC > 3 0 TGTIGACATG 1020 

C-ATGAAATTG AAA A 3 T AG C A AG AAGTG 3 AA GAAGACCAAG AC 2CATCA IG CCC3AGGCTC l-jr-0 


AG3AGGGAGC TCCT" 3G AT3 A GAAAGAG3CT GAAGTCTTG2 AG-3ACTCAGT GGATA3ATGT 1" 7 40 

CATTCGACTC CTTCAGGTTA TCTTGAACTG CCTGACTTAG G3GAGCCGTA CAGCAGTGCT 13 30 
25 CTTTACTCAT TCGACGAACA GTACCTT3GC TTGGCTCTT3 AC3TGGACAG AATTAAAAA 3 

GACCAAGAA 3 AGOAA.GAAGA CCAAGGCCCA CCATGCCCCA GGCTCA3CAG GGAGCTGCTG 1.320 

«;AGGTAGTAG J^C^CTGAAGT CTTGCAG3AC TCACTGGATA GATGTTATTC AACTCCTTCC 19c 0 


AGTTGTCTTG AA( : AGCCTGA CTCCTGCCAG CCCTATGGAA GTTCCTTTTA TGC ATTGGAG 2 04 0 

GAAAAACATG TTOG^TTTTC TCTTGACGTG O 3 AGAAATTG AAAAGAA<3GG GAAGG3GAAG 2 Km 

35 AAAAGAAGC-G G AAG AAGATC AAMGAAGRAA AG AAG AAGCVG GAAGAAAA.CLA ACK-3GAAGAA 2 10 0 

1 3 ATC AAAAC C CAGCATf^CCC CAG3CTCAAC GGCGTCJCTGA TGGAAGTOGA AGAGCSTGAA 21:20 

GTCTTACAGG ACTCACTGGA TAGATGTTAT TC G ACT C C G T C AATG T AC TT TGAACTACCT 2 2r0 


GACTCATTC C AG3ACTACAG AAGTGTGTTT TACTCATTTG AGGAACAGCA CATCAGCTTC 2 340 

<3CCCTTTAC3 T< 3G AC AAT AG GTTTTTTACT TTGACGGTGA CAAGTCTCCA CCTGGTGTTC 2 400 

45 C AG ATGGG A 3 TCATATTCCC ACAATAAGCA GCCCTTASTA AKCCGAGAC^A TG T C ATTC C' V 2 4 60 

GCAGGCAGGA CC1 ATA3GCA MGTGAAGATT TGAATGAAAG TACAGTTCCA TTTGGAAGOC 2 52 0 

CAGACATA'GG AT03GTCAGT GGGCATGGCT CT ATTC CT AT TCTCAAACCA TO 3 C AGT 30 C 2 580 


AACCTGTGCT CAGTCTGAAG ACAATGGACC CACGTTAGGT GTGACACGTT CA'3 AT AACTG 264 0 

TGCAGCACAT '3C03GGAGTG ATCAGTCRGA < 3ATTTT AATT TGAACCACGT ATCTCTGGGT 2700 

55 Ai 3C T ACAAAA TTCCTCAGGG ATTTCATTTT '3CAG3CAT3T CTCTGAG3TT CTATACCT-3C 276 0 
TCAAGGTCAK TGTCATCTTT GTGTTTAGCT CATCCAAA3G TGTTACCCTG GTTTCAATGA - 282±)- 

ACCTAACCTC ATTCTTTGTG TCTTCAGTGT TGGCTTGTTT TAGCTGATGC ATCTGTAACA 2 880 


CAGGAGGGAT CCTTGGCTGA GGATTGTATT TCAGAACCAC CAACTGCTCT TGACAATTGT 2 94 0 

' I 1 AAj 2 CC GC T A GECTCCTTTG GTTAGAGAAG CCACAGTCCT TCAGCCTCCA ATTGGTGTCA 3 000 

5 GTACTTAGGA AGACCACAGG T AG AT OG AC A AACAG2ATTG G3AGG2CTTA OCCCTGCTCC 3 060 

T GT 2 RATTC C AT C CTG T AG A G AACAGGAG T C AGG AGS C GG T 1 3GC A GG AG A GAG 2 AT GT : 2 A 2 1 2 0 

CCCA'GGACTC TGCCG3TG2A G AAT ATGAAC AAYG2GATGT TCTTGCAGAA AACGCTTA3C 3180 


CTG A GTTTCA TAG3A2GTAA TCACCAGACA ACTG2AGAAT GTRGARCA2T GAGCAGGACA 3 24 0 

G2TGACCTGT CTCCTT2A2A TAGTCCATRT CACCACAAAT C Ai 2 AC AAC AA AAA OGAGAEG 3 3 00 

15 AGA T ATTTTG GGTT' 2 AAAAA AAG T AAAAAG ATAATGTAGC TO2ATTTCTT TAGTTATTTT 3 3 60 

GAECCCCAAA TATTTCCTCA TCTTTTTGTT GTTGTCATKG ATGGTGGTGA CATGGACTTO 3 4 20 

TTTATAGAGC.; ACAG3TCAGC TGTGTGG2TG AGTGATCTAC ATTC TG AAG ' V TGTCTGAAAA 3 4 30 


TGTCTTCATG ATTAAATTCA O2CTAAA20T TTTGCCGGGA AGACT'GGAGA GACAATOCTO 3S40 

TGAGTTTC C A ACCTYAGCCC ATCTG2GGGC AGAGAAGGTG TAGTTTGTCG ATCASCATTA 3 600 

25 TCATGATATG AG3ACTG3TT ACTTGGTTAA G 3 A'GGGGTCT AGG AG AT C TG TCCCTTTTAG 3660 

AG AC AC CTT A CTTATAATGA AGTATTTGGG A2GGTGGTTT TCAAAATTAG AAATGTCCTG 37 21) 

TATTOCRATO ATCATOCTGT AAACATTTTA T 2 ATTTATTA ATCATCCCTG CCTGTGTCTA 37 30 


TT ATTATATT CATATCTCTA C'GG TGG AA A C TTTCTGCCTC AATGTTTACT GTGCCTTTGT 3 84 } 

TTTTGCTAGT GTGTGTTGTT GAAAAAAAAA ACATTCTCTG GCTGAGTTTT AATTTTTGTC 3 900 

35 caaagttatt ttaatctata c aatt aaaag cttttg3cta tcaaaaa2g\a a.vj\aaa/^ 3 960 
;j\aaaaaaaa aaaaa<:;ogga cgcgtgggo 3 589 


(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3736 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

CTOCTGTTCG CTOG2TGGGC TCCGCAOCAG GCTTGOCCAG CGGGTGACGG GTCGGCG3G2 6C 



ATTTTTACTG GCAAAGAAAT CCGGGGAGAA TGTOGCCAAG TTTATTATTA ATTC AT AC CC 3 6'" 

CAAATATTTT C AG AAG3AC A TAGCTGAACC TCATATACCG TGTTTAATGC CTGAGTACTT 4 3v 


TGAACCTCAG ATCAAAGACA TAA3TGAAGC CGCCCTGAAG GAACGAATTG AGCTCAGAAA 4 8 0 

AGTCAAAGC3 TCTGT3GACA TGTTTGATCA GrTTTTGCAA G3AGGAACCA CTGTGTCTCT 540 

10 T-3AAACAACA AATAGT7T3T TGGATTTWTT GTGTTACTAT G3TGACCAGG AGCCCTCAAC 60 1 '- 

TG ATT AC CAT TTTCAACAAA CT3GA 2 AGTC A3AAGCATTG GAAGAGGAAA ATGATGAGAC 66 0 

ATCTAGGAG3 AAAG3T3GTC ATCAGTTTGG AGTTACATGG CGAGCAAAAA ACAACGCTGA 7::*.: 


GAGAATCTTT TCTCTAATGC C AG AG AAAAA TGAACATTCC T ATTOC AC AA TGAT3CGAGG 7 ft:"! 

AATGGTGAAG CACCGAGCTT AT3AGCAGGC ATTAAACTT3 T AC ACT 'GAGT TACTAAACAA 34 0 

20 CAGACTCCAT GCTGATGTAT AC^CATTTAA TGCATTGATT GAAGCAACA 3 TATGT'GCGAT 90-") 

A AAT GAG AAA TTTGAG3AAA AATGGAGTAA AATACTGGAG CTGCTAAGA 3 AC ATGG TT GC 96 0 

ACAGAAGGTG AAACCAAATC TTCAGACTTT T AAT ACCATT CTGAAATGTC TCCGAAGATT 102 0 


TCATGTGTTT OCAAGATCGC C AGCCTT AC A GGTTTTACGT GAAATGAAAG CCATTGGAAT 1030 

AGAACCCTCG CTTGCAACAT ATC AC C AT AT TATTCGCCTG TTTGATCAAC CTGGAG AC CC 114 0 

30 TTTAAAGAGA TCATCCTTCA TCATTTATGA TATAATGAAT I 3AATTAATGG GAAAGAGATT 1200 

TTCTCCAAAG GAC 3CGGATG ATGATAAGTT TTTTCAGTCA GCCATGAGCA TATGCTCATC 12 60 

T C T C AG AG AT CTAGAACTTG C CT ACCAAGT AC ATGGCC TT TTAAAAACCG G AG AC AAC TG 13 2 0 


GAAATTCATT GGACCTGATC AACATCGTAA TTTCTATTAT TCCAAGTTCT TCGATTTGAT 13 PC 

TTGTCTAATG GAACAAATTG ATGTTACCTT GAAGTGGTAT GAGGACCTGA TACCTTCAGC 14 4 3 

40 CTACTTTCCC CACTCCCAAA CAATGATACA TCTTCTCCAA GCATTGGATG TGGCCAATCG 150 0 

GCTAGAAGTG ATTCCTAAAA TTTG3AAAGA TAGTAAAGAA T ATGG TCAT A CTTTCCGCAG 156 0 

TGACCTGAGA ( lAAGAGATCC TGATGCTCAT GGCAAGGGAC AAGCACCCAC C AG AGCTT CA 162 0 


GGTGGCATTT GCTGACTGTG CTGCTGATAT CAAATCTGCG TATGAAAGCC AAC CC ATC AG 16 30 

ACAGACTGCT C AGGATTGGC CAGCCACCTC TCTCAACTGT ATAGCTATCC TCTTTTTAAG 174 0 

50 GGCTGGGAGA ACTCAGGAAG CCTGGAAAAT GTTGGGGCTT TTCAGGAAGC ATAATAAGAT IB 0 0 

TCCTAGAA3T GAGTT'3CTGA ATGA3CTTAT GGACAGTGCA AAAGTGTC T A ACAGCCCTTC I860 

C CAGGCCATT GAAGTAGTAG AGCT3GCAAG TGC CTTCAGC TTACCTATTT GTGAGGGCCT 1920 


1 3 AC C CAGAG A GTAATGAGTG ATTTTGCAAT CAACCAGGAA CAAAAGGAAG CCCTAAGTAA 19^0 

TC T AACTG 3 A TTGACCAGTG ACAGTGATAC TGACAGCAGC AGTGACAGCG AC A 3TG AC AC 2040 

60 CAGTGAAG3C AAATGAAAGT GGAGATTCAG GAGCAGCAAT GGTCTCACCA TAGCTGCTGG 2100 
•lTT AAj Z TGC C TTCAACATAG AGAAAATTAA 3GCCTCACAG GATGAGTCTC CATTOTCTGO






AATCACACCT G AG AA C TG AG ATATACCAAT ATTTAACATT GTTACAAAGA AG AAAAu AT A- 2160 

CAGATTTGGT GAATTTGTTA CTGTGAGGTA CAGTCAGTAC AG AGCTG ACT TATGTAGATT 2 22*.: 


TAAG2TGCTA ATATGGTACT TAAC 3ATCTA TTAATGCACC ATTAAAQXT TAGCATTTAA 2 280 

G T AGC AAG AT TO:GGTTTTC AOACACATGG TGA3GTCCAT GGCT3TTGTC AT3AGGATAA 2 34 0 

10 GCCTGCACAC CTAGAGTGTC G3TGAGCTGA C2TCACGArG CTGTCCT3GT G33ATTG3CC 2400 

TCTC3TGCTG CT33A2TTCT GGCTTTGTTG GCCT3ATGTG CTGCTGTGAT GOTGGTC 1'TT 2460 

CAT3TTAGGT GTTCATG3AG TTCTAACACA GTTGG3GTTG GGTCAATAGT TTCCCAATTT 2 52.) 


C AG3 AT ATT T C G ATG T G AG A AATAA33CAT CTTAG3AAT3 ACTAAACAAG A T AAT G 3C AG 2 580 

TTTAGGCTGC ACAACTG3TA A AATG A GT GT AGATAAATGT TGTAATTAGT G T AC A C GT TT 2 64 0 

~>n m , nm*™*^~^ rrwr.-AT^.r TTTrrTtir-T ^iiri.Tri tc^atcttt" 2700 

V7 V_jirtlllli-i iJT-rt J. /lai. ^ ^.-^ -l - - - - - - — - - ■ — 

ATGTCTCCCT TTTTTTTTTG TCTATA'GCTG TTAGGTATTT TAGTGGTTG A AATGAGAGCT 
AGTG ATGAC A GAAGGATGTG GAATGTCTT2 TTGACATCAT T3TGTA3T0C r IGGTAATCAA 


GTTGGTAACG AGTAGTTCTA GCACX3TCTTA CCACTATGAC TTA.AGTQ3TC CTGGAAGQ-A 

GTAAGTGGAG GT TTGG A3C A TTCCTGCCTT CATGAQOGCT TCTACCACTG ACCACTTTGC 

30 ACGTACCTGG CTCCCA3ATT TAG TT AGG T A CCCCACGAGT CGTCCACATA AGCAGGTTCA 

TCTTTACCTT C5CCAGAGTTG ACAA1 T ATGG 3ATACTCTAG TCTAGTTATA CTTGTGTTCC 

CATCTGTCTG C | 3ATCCTCTG AAG3CCA<3GA CCCAGT7ATA CATCCTTAGA AA C C AAAGT A 

T'3GTTTTTGT TTTCTCTTGG AATGT3AGGT ( 3TTAAGG TAT TTAATTGAGG GACAAAAAAA 31r'J 

AAAAAAAGCC GATATAGTAG CTA3CTACTT .AA 1 ^ GAT 3' TAT GGGTATTGCT CCATATCAAA 3240 

40 GGAGATTTGC A3GACAGAAA GAGTAAATTA CrC 3TTGAGTC TTGGT TT AC A G3TTCCAAG3 3 300 

AGA3CCTTGG CGACCT(3AAA TGTTAACTCG GT 3CCTT3CT GTCTCTAGTT GATGAGGAC Z 3 30 

TG3AGATG3C T3AGTCTTGT TAGGGTTACT ATTGAATACA GTCCTTAGAT T 3 AG 3- 3" P ATG 3 4..0 

3 3T7TT2GTA TGCAGGCAIC TATT 2TGAAT CACCAO^GTTG "2 2 > ."A- - .T A3A3T 2 3 AT A. 3 4 0 
GGAGAAAATC C ATTTGGG T A GATG3CCTAT GAATTTGTAG TAGA3TTTCA AAATGAGTGA 



276 0 
2 82 0 
2380 

2 94 0 

3 00 0 
3 06 0 
3120 



3S40 



50 TTTGTTA OCT TGGTACTTTT AAGTTTGTTGG TACA3ATCCT CCAAACCCAT ACTCT3AGCA 3600 



i 6 6 0 






(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1667 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

TAGTAATTCA TTl'AA'^rCCT CTTACATCAG T AGCG ACAAT GAGTCAGATA TCGAAGATGA 60 

AGACTTAAAG TTAGAGCTGC GACGACTACG AGATAAACAT CTC AAA* 3AGA TTCAGGACCT 12 0 

15 GCAGAGTCGC CAGAAGCATG AAAIVGAA'TC r rTT' 3T AT AC C AAACTG-3GCA AGGTGCCCCC l^'O 

TGCTGTTATT ATTCO :CCAG CTGCTCCC7T TTCAGGGAGA AGACGACGAC CC ACT AAAAG 24 0 









CAAAGGCAGC AAATCTAGTC GAAGCAGTTC CTTGG3GAAT AAAA3CCCCC AGC TTTCAGj 3mO 
TAAOTTGTCT GGTCA^AGTG CAG3TTCA jT CTTGCACCCC CAGCAGACCC TCCACCCTCC >>C 
TG*3CAACATC CCAGAGTCCG GGCAGAATCA GCTGTTACAG CCCCTTAA-3C CATCTCCCTC 4/0 

25 cagt:-acaac ctcta itoag cctt:a:cag tgatggtcgc atttcagtac caagcctttc 430 

TGCT^CAGGT CAAXiAACCA GCAG^A^AAA CACTGTTQ3G gcaacagtga acagccaagc 540 
CGCrCAAGCT CA3CCTCCTG CCAT3ACCTC CAGCAGGAAG G3CACATTCA CAGAT3ACTT 


GCACAAGTTG GTAGACAATT GGC-CCCGAGA TGC 1 3ATGAAT CTCTCAGGCA GGAGAGGAAG 
caaagggcac ATGAA1TATG AGGGCCCT7G AATGGCAAGG AAGTTCTCTG CACCTGGGCA 7 20 

35 ACTGTGCATC TCCATGACCT CGAACCTGGG TGCCTCTGCC CCCATCTCTG CAGCATCAGC 730 

TACCTCTCTA GGTC A CTTC A CCAAGTCTAT GTGCCCCCCA CAGCAGTATG GCTTTCCAGC 
TACCCCATTT O'X'GC TCAAT GGAGTGGGAC GGGTGGCCCA GCACCACAGC CACTTGGCCA 






600 
660 



£40 

900 



G1TCCAACCT GTGGGAACTG CCTCCTTGCA GAATTTCAAC ATCAGCAATT TGCAGAAATC 960 



1020 
1080 



CATCAGCAAC 2 2CCCAGGCT CCAACCTGCG GACCACTTAG ACCTAGAGAC ATTAACTGAA 
45 TAGATCTGGG :3GCAGGAGAT GGAATGCTCA GQ3GGTGGGT G3GGGTGGGA AGTA3CCTAT 

AT ACT AAC T A CTAGTGCTGC ATTTAACTGG TTATTTCTTG CCAGAGGGGA ATGTTTTTAA 1140 
TACTGCATTG AGO~CTCAGA ATGGAGAGTC TCCCCCGCTC CAGTTATTGG AATGGGAGAG 1200 


G AA< jGAAAGA ACAGCTTTTT TGTCAAGGGG CAGCTT , 2AGA CCATGCTTTC CTGTTTATCT 12 60 

ATACTCAGTA ATGA-3GATGA GGGCTAGGAA AGTCTTCTTC ATAAGGAAGC TGGAGAACTC 132 0 

55 AATGT AAAAT CAAACCCATC TGTAATTTCG AGTGGGT3GA GCTCTTGCTT TTG 3T ACATG 13 8 0 

CCCTGAATCC CTCACTCCCT CAAGAATCCG AACCACAGGA 1 3AAAAACCAC CTACTGGGCT 
CTCT<~CTAC^ CTGC 3CTCCT CCCTTTTTTT TACCCCTCTC TTTTTTATTT TTTCTTTGCT 




1440 

1500 
CTTTAGAACC C AGTG AAAAA TACCAGGGTA CTGGGGTGCA ACTCTTTCTT AT GAT AGG TC 15 60

ATTAGTGCTT TAAGCAAAAG ATATTAGCAG CTTTG AC TOO AGCATTAGCA ATT A SG RAAA 1620 

5 AAAAAAANWA AAA ACTC G AG GGGCGGCCCG GTT AC C CAAT TCGCCCT 16 67 

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1408 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

70 A TT AC AO A C C TGAGCACTGT OrCTX^CCAAG ACCTGTCTTA ATAGATTAGA GAACCACTGA 6 0 

TAGATGGTCA GCTTTCTGTA G2AGTGAGAA CCCTACATTT CAAATGTOCA TAGCACCTTT 12 2 

GCGGGGAAAC ATCACTTGGC A CATC TGC AT TCTTTTTTG A CACAGGGTCT CACTCTGTTG 130 


CCCAGGCTAG AGTGCATGGC ACGATCTTAG CTCACTGCAA CCTCCACCTC CCAAGTTCAA 24 0 

GCGATTCTT 2 TGCCTCAGCC T2CTGAGCAG CTGGGATCAC AGACATGCGC TACCATGCCC 3 00 

30 A3CTAATTTT TTGTATTTTT TGTKTGTTTG r rTTTTGTTTK TAAGTAGAGA CGGG2TTTCA 3 60 

CCACGTTGGS CAGGCAGGTC TCGAACTCCT GAMCTCAGGT GATCCACCCA CATCTGCGTT 42 0 

C C AAT ATCTT TCTC:AACATA ATGAT AGO CG TAATTAATAT TTTCCAGTAC ATTTTTATGC 4 80 


CTTTACACAC GAGAGTGGTA GACAGACACA AACCCAGATC TGTCTGACTC CAAAGCCCGT "54 0 

TTGTCATCAT TCCTTTTACG GTATCCTATA GTGGTATCCT TTA 2AGAAAG ACAGCTTTTA 600 

40 2CCAACAAAG ACTTAACTTC CCAGGATGCC AGAAGGACAA AGO-3GGATTG CTTTTAAGPA 66 0 

GFAAGTTATC AAGAMCTTAT TTTATAAATG AG ATT AG AT A GGGAAAGGCA ATTTATCTTT 72 0 

ATTAAAAAOT GAAAA2GCCA G:iATA<2GGAA GGAGGTCCTT CO TIOGTCTT TTT < ; ' AGGG AA 2r'0 


ATACTTCAOT TG7ITTTATT AGAAACAG2CT AGTACCTAAG G7TCTOACGT AGGVJACAGCT 84 0 

TAAGGCATGC TAATGKTCAT GGGTC2TTCC AT AGTC ATTT TK 2TATTTTG GTTCACATTT 900 

50 GAQ7AATA3G CAG2CCTTCA CTGGTGCT<2G AYTCATTCCT CO 2 A YT ATT A CAGGTGACAG 960 

AGGAGACAGG AGGTATGT7T TTTCTA1 , TTT TAWACATOCT TT AT ATTT AA 2 AC AAGCTCT 102 0 
-ATGGA TTCTTTTGCA AAGGTGAATA TTTTTCTCTT 13 20 

7TCATT TATTAGTCTT GC T AAAAAAA AAAAAAAAAA 13 30 

-ATT 140 8 



(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2031 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:


AGGATATGCA TC-ATTCCTAA CGA3GCTATA TGTT AAAAAA AAATTAGAAA ATOCAATACA 60 

TTTTTTACTA TArAAACTAC AGAATG AGT A TGGAAGTTTT ATTTATCAAA ATGTAATGGA 12 0 

25 TTTTTAAAGG CTGLAGAAATT TTCCTTATAC CTACCTTTTC AC«TTATTTI'A ATTATACCAA 13 0 

AGTTATCAACT AjGAA.TA.GCTT CATCCATATG AAATATAAAA TG-AAGAGACA CCTAGGCTCT 240 

ATOAGGCCTA C^GTTCTTTO AA2TTATTTC CACTTTAATT TCTCAGTGGA AG TT AAGAGG 3 00 


GGTGAGAAAA CAAAGAAGGG GAAAAACTGA CAACTAACAA AACCA2CACC ACATCGCTAG 3 6(3 

G TGGTGCTT A CTAA^TTACCT TCTOACX2ATT TTCCTCAGAT TGAAAA<3CTT ATC-AGGATTT 42 0 

35 CTCGGG AGT G TTAA.TAAG.rT G-GG TGTT AGT AG AGAGCTTT G CTGA TGA T A TTTAGTGTTG 4R0 

AGCACATGTG GTTGTAAAAC CTTAAGTTTG TTTCTCCAGG AGGGTGGTGA TAGAAACAGA 54 0 

-v^"-GT-^T T ATGA A.CT3A TGTTCTGGTG AAATGTTGAG GGTGGOGAG A AAAGAGTTTA 600 


AGGGA.GG-AGA GCCATCTATT TTOTTCCTAA AGCCACCTCT CAGCAGAATG GTCATGTTTT 660 

TCTGATGCAC GC<T>2TGCTT CATGC CCAAG ATGAGTTGGG AGGCAATCTG AGGAGCTGTG 720 

45 ■GACTTAAJGC?. TTOCAAAGCA CACTCTCTTT CTCAGCGTTG TCTGCAAGTC AGTAGGTGTT 780 

AGTATGCTTG CAAAG TTCAG TGTCTG AGG A AAGTTGAAGT GGGCTACCTC TCTACAGCTG 84 0 

TTTCCTCAGA GGGAA A AAT2 TTGAGAGCAG ATGGTGGAGC TCTGGAGTCA GAGGAAATGG 900 


GTGTCTTCAG CACAAAjGCTG GGGCTTTTAC TTGAGCCACT TCTGACATTT TTACATACCG 960 

AGCCTGAJGAT TRTGTGATTA TCTCAAATCA AATCACTTTG ATGGAGATAA ATAATGAAAA 102 0 

55 GTGTTTTATA GTCATTGATT TGGTGAGAAC AGTAATGGAA AAT2GTGTTG AAGGAGTTCT 1030 

CATCTTTGGA CKTTTCCTTC GAjGAGTCCTG GGTGATTGGT GTTCGCTGTT CATCTGAGCC 11 W 

GGGA AAAGCA TTATTACTGA T AGTTGGAC A CAGTCAAAAG CGCAGACTGG ATGGATGGTC 1200 

T TTT AT AAG3 CATTTAAG3G TACACTACTG TGTTTCACTG ACCATACATT TTTCTTAGCC 1260 

CCTCAAGTAA TAT AGO AC AG AGTTATG.AAT GACAATTC CC CTAACCATTC CTCTTCATAT 132 0 

5 CTGCCTCTTC CCCTTACCAT CGTAATT0T0 CAAACT03TC AT AAAO 3C A' Z TCTGTGAAGA 13 80 

TATTGG3GAC TGACATCTTA AGCTCTCAC3 TGGCTGCAOr AG3AAAG3CC AAA2TGACGA 1440 

CAAAAAAAAA ATTCTTTATA AAGATGATAT GGTAACATGT ATCTTTGOCC TGGGTCT3GG 150') 


TGGGTCCAGT C A 3TCT GAGA TTTAGAA02A TTTAGGAGrG TA3GTAAAAG CTGCTAGTAT 156-;) 

TCTTTTAAAA GTTACATTTA TGACTTGCAA IGATAGAAAA GT 2CTTCC AA TTAAATG3CA 162 0 

15 TTTTATAYTA TTATGTGTGT ACTTGACAGT G TT AAAAAT A CC3TCATACG TTATTGCATT 1680 

TGATCTTCAC A 3AAAGT OCA TTTTAACCAG TACTCTG3GT G7AATAAATA AT ATG TAG Ah 1740 

ATTTAAGTC 2 TO -AATTGGA (3CATATCCAG TGAGTTTTGA CA3TGTGTTT ATGTGGAATG 1300 


TTTAAG3ATA TAGAATTGTA CTTTATATAA ATT 3 3TTC TT GTTCTTCTTA AAT3T3ACAT 1860 

GAAATAATTG T30TO3TACA TTATACT3GA AATTAACAG3 G3AAAAGGGA AGA3CTCTT3 192 0 

25 '3CTCCCTrGA G3TTCTG:TA GT3GTGTTAG GAGTGGTTAC AAC TGAG Z TT TTAGTAACCA 198 0 

TTTAACCGTA TGTAAACTTG GTTTCTAATT AAAAAAAAAT TTCTTTTTCC A 2031 


(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 971 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

CGCGTCGGAA CTCGGCCGCG '3GA2ATCCAC GG033GCGAG T 3 ACACGCGG GAGG3AGAGC 60 

AGTGTTCI3C TO- 3 AG2 C G AT 0CCAAAAACC ATGOYTTTCT TATTOAGATT CATT3TTTTC 12 0 


TTTTATCTGT GG3GC3TTTT TA 3TGCrCAG AGA3AAAA3A AAGAGGAGAG CACCGAAGAA 13 0 

GTGAAAATAG AAGTTTTGCA TG3TCCAGAA AACT3CTCTA AGACAAGCAA GAAGGGAGAC 240 

50 CTACTAAATG CCCATTATGA CG3CTACCTG GCTAAAGAC3 G2TCGAAATT CT AC TGCAGC 300 

CGGACACAAA ATGAAGGCCA CC 3CAAATGG TTT3TTCTTG GTG TTGO OCA AG TO AT AAAA 360 

^OCOTAGAOA TTGOTATOAC AGATATOT— " '03TCGAGAAA* AOCGAAAAOT AOTTATACC" 4?" 



00 
C AAAGGGAAT TTG AAAAAG A T< J AGAA' ^ 2 C A 
GAAGATATTT TTAAGAAGAA T" j AC CAT j a.' r 

aatgtatac 1 o aa : acg atg a act at a sc a: r 

TTTACTGT A 2 TTTA FGTATA AAACAAAGTC 
10 CCCCTATGAS AA3ATATTTT GATCTC2CCA 
GCTGTTTTGC AAACTTAAAA AAAAAWWAAA 
C CG MAT ATG A T 




CGTGACAAGT CATATCAGGA TGCAGTTTTA 6 60 

Q OTG A'FOGCT TCATTTCT CC CAAGGAATAC 7^0 

ATTTGTATTT CTACTTTTTT r rTTTTAGCTA 78 0 

A2TTTTCTCC AAGTTGTATT TGCTATTTTT 84 0 

ATACATTGAT TTTGGTATAA TAAATGTGAG 90 0 

AAA ACT SG AG GGGGGCCCGT ACCCAANTCG 9 60 

97. 



(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1792 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

GAACCCCCTT TCTCCTGGTA AAGGGTAAC-G GGGGGGATAA TGTTTACCAC AGGTACGAAA 6 0 


TAG TC ACTTT AACATTGAGA CCTCTGCCTC ATTGAATTCA GGTTTTTTAA GTACTTGAAA 120 

CTCTTCAGAT TCTCCTTATT TTAGTTTCTT TTTACATTTA 1GAAGTAGAA AGCATTGTTT ISO 

35 TGTAAAOTOT TTTGAAAATA A AT* AOS CT AG TCTCTTATCC TCTTTAGCGT GGATTAAAGG 24 0 

TGAAGTTCTG CAAATG3GAG AGTGTT SAO A GTAGATAGCT CAGATTGATT GAACACATTT 300 

G AGG AA- SAGA CTCCTGCATG AGATACCAGC ATTTTTACAA ATACTTTTTA TGTACATTCT 3 60 


TTATTTTGTC ATTTTGTCAA CCCTCTCCCC AAGCACATCT TCTTTCCTTT TACTATGTCT 42 0 

ATGTAGGGAA AAACAAAACA AAAAATTGCA CTTACGTTAC ACTCCCAAAA TGTGGGTAAT 480 

45 CCGTGTCTTT CAAAAAACAT TTCTGTTTTT TGTTTTGTTT TGGTCAGTCC ATTGCATAAG 54 0 

TGACAAGTTT GOSTOCTTGT GGCACGTATG TATGAAGCGG GAGGGGGATG AS AATTGC CT 600 

GTCCTTCAGT ARGCTGTAAA AGTAATTTAC ATGTAAGTAA AAAGGGAAAA TAGAATAGAT 660 


GCCAAAGTCA TTTATTCAGT CCTTAGTTTT CTTATGTGGC ATTACTGCAT CTGCTAGTTA 72 0 

GTGAGAAA* SC ACCCTCAGCT TTTACTGCTC CCCTCCCTGC CTGCCAACAC ACTTGATGTG 780 

55 TGCAAACAGC CCTCAAGTAT CTGTCAGATG AC CT AT AT AA GGTATTGAAT AAGGTATTCT 840 
TGTCAGTTTA G AAATGG AC T GGATAAAACT TACTTGGTTG TCATTATTTT AT C TC ATTTG - - 90 Q - 

TCCTGTTACA TGCCCTATGT TAAGATAATT ATATTGCCAC TAATAATCAA GATGCTAAAT 960 

GAGTATTACA AC TGGCT AAT AT3ATTTTTT ATATACAAGG GTATGTGTAT ATTTGGAATT 102 0 

GRTATGAGAA ACTCATTTGT AC 3CATTTGA GTGATATTGC ACAACAAACA CAG AT A Y CTA 103 0 

CAGACTCCGT TTT'ZATTTTC TC3TGTTCTT TATGATAATG ATCTTTGTAG ATTGGTTATT 114 -.J 

TCTGTACTTT ATCT3TAATA AA3TTTGTAG ATCCTGTGAA CCATTACTTT GOCTAAATCA 1200 

3TTGAGACTT GAGTGTTTAA T AACAAA 1 3C A TCAATATTCA CTAAAGTCAA TCTCTTTTGA 1260 

GTTTCTGTGA CTTG3CTAGA AG3TCTTGAC ACTAAGG3AT TAGTGTTAAT TTTCCCTGGG 132 0 

3GTGTT0CAC TAGGOCATTA CT3TATA-YTG ACTTGAT3TT G3CACATA'3A CTTCAAGATA 13 30 

T A T.-V\T ATT ' P TG AG 3ATTTT GTTGATrGGC CTAT3TTTTA TTGCATAGTG TOAAACGTGT 1440 

AAAGITTOGT TAA3CTGTAT AT AGA 1' A 3C' T TATTGTTGAC TAGTTATAGT GTATTTAGGG 150- ) 

TTOOCT' 31' AA TATTTAAGCT TCTTTACTGA TGTGTGTGCT G3TAGGAACA TATAATTTTT 1500 

OT AC ATT ATA TTTA 3TGACA TGTTGCCTTT TTTATTTTAC AAATACTTTG GAATTC OAAT 1620 

GTGTTTTTTG CTTCCGTGAG GATTAATTTG GAAAGGTTTT T AAT 3 AC ATT CCACTGATTT 1630 

CAGATTTTGC TT3AGATTGA 1 3TTCAATAAA TTGTCCTGTA TGTTCC AAAA AAAAATTAAA 174 0 

AAA3TCGAG3 G33GCCCGGT ACC'3AANNCG CCG3ATATGA TCGTAAACAA TC 17 20: 



(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 896 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:



AGTTGNANA 3 AACA3GACCT GA3T3CTTGG GCAGCACCAG TAGGTTGCCC CYTOCYTCYT r'J 

GC3AG2YTCA C YTG3CACYT TYTGGCCCTY TOGGGATG 3C TTCG2AGACA GAGYTYTT 3G 12) 

0T3C3T3TG3 T GOC 3AYTCT TT 3 2TTTTGG TTYTCTPGGO CCTTG3C3TC 20TTTTT3TC 130 

CC2G3GCA3C CTT3T3T3AC 2T3CCCTTTT CCCTCCCTTC CTTTCCAGGA CAAG2ACGCC 2 40 

GAG3A3GTGC G3AAAAACAA G3AG3T3AAG GAAGAGGCCT CCAGGTAAAG CCTAGAGGCC 3 3 0 

AAAGAA3TTT CCAG3TCAGC CGGACA-3CTC CA'3CAG3TCC ACGTTCCAGG CAGC2TCGMC 360 
ACATTGAGCC TCCCAGOITAC CATGTTGAGG AGAGATGAAA ACCAGGGCGG TAGAACTTCA 66 0 

GGGTGAAGGA CAGAGTCCTG GGTGGGGCAG CGGCT<3CAC-G GC ' X AC C AG A GAACCCAGCC 72 0 


AGAGGGG2TG TGAGTACCAG TC-GTGTTGCT TCCACCCTGC AOrAGGTGGG ATGAGGTCTG 7 30 

TGTGTGTGTG TGAACCATCA TTTTTT^jATC ATCATGACCA ATGAAACATT GAAAAAAAAA 34 0 

[0 AAAAAAACTG G AGG GO GGC C CGTACCCAAN TC GCC GMAT A CTGATCGTAA ACAATC 896 



(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 912 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

25 TCGACCCACG CGTCCGOTCA GCCAGTCGOA TCCAGCCATG ACAOCCTTCT GCTCCCTGCT 60 

CCTGOAAG2G CAGAGCCTCC TACCCAGGAC CAT) 3GC AGCC CCCCAGGACA GIXTCAGACC 12 0 

AGGO^AGGAA GACGAAG3GA TGCAGCTGCT ACAGACAAAG GACTCCATGG C -AA3GGAGC 180 


TAGGCCCGGG a^GAKCGOrG GCAGGGCTCG CTG^GGTCTG GOCTACACGC T-3CT-3CACA^ 24 0 

CCCAACCGTG C AGGTCTT 2C GCAAGACGGC CCTGTTGGGT G2CAATGGTG CCCAGCCCTG 300 

35 ARGGCAG3GA AKGTCAACCC ACCTGCCCAT CTGTGCTGAG G2AT jTTCCT GSCTACCATC 360 

CTCCTCC2TC CCCGGCTrTC CTZCC AGS AT CACAC2AGCG ATGCAGCCAG CA^TCCTC 2 420 

GGATCACYGT G^TTKGGTGG AGGTCTGTGT GCA2TGGGAG C CT CARGARG GCTCTGCTCC 4 80 


ACCCACTTGG CTATGGGAGA GC '2AGCAGGG GTTC TGGAG A AAAAAACTG3 TGGGTTAG3G 540 

CCTTGGTCCA GGAGCCAGTT GAGCCAGGGC AGC2ACATCC AGGC3TCTCC CTACCCTGG2 600 

45 TCTGCC ATC A GCCTTGAAGG G2CTCGATGA AGC2TTCTCT '3GAACCACTC CA3CCCAGCT 660 

OCACCTCAGC CTTGGCCTTC ACGCTGTGGA AGCAGCCAAG GCACTTCCTC ACCCCYTCAG 720 

CGCCACGGAC CTYTYTGGGG AGTGGCCGGA AAGCTCCCSG 3CCTYTGGCC TGC AGGGCA 3 7 30 


CCCAAGTCAT GACTCAGACC AGGTCC CAC A CTGAGCTGCC C AC ACT CG AG AGCCAGATAT 840 

TTTTGTAGTT TTTATKCCTT TGGCTATTAT GAAAGAGGTT AGTGTGTTCC CTGCAATAAA 900 

55 CTTGTTCCTG AG 912 

(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1332 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

10 AATTCGGCAC GAGC :5GAG3C GAGGGAAACT RAGCGCGAAA GTTCTGTGTC GTGTTGGCAG 6 0 

GAGGG-CCTAG AAO 3C> AAAG A CTGTOTAGT ? GGACAATGTC ATATTATAAA TTTGGAATGC 

TGAATAGAAA ATT A T AG ATT TTGATATTGA AGGAAAT3AA GCGAAGCYTA AATGAAAATT 

C AGC TC ■ 3AAG TAZAGOAGGC TGTTTGCCTG TTCCGTT3TC CAATCAGAAA AAGAGGAACA 

GACAG3CATT M'nT'JT.-UT CCACTTAAAG ATGATTCAGG TATCAGTACC CCTTCTGACA 






GAAGTCAAGA TTCT3TCTTT AACTCTATTC AATCAAATAC TGGAAGAA 3C CA&3GTGGTT 



AGC CTC AATG TAAACGAACA AACTTAGTGG CAAATGATGG AAAAAATTCT TGTCCAATGA 



120 
180 
240 
300 



Z( J ATTATGATTT TGCT ! J'J'IMA i_ - i l l^jv.^. i - * — ^ w ----- - - - 

CTCCTGTAAT GAAAACA3T3 GACACCGGGO AAATACCACA TTCAGTTTCT CGTCCTCTGA 42 0 



430 



GGAGCT AC AG AGAT3GTAAC AAAA AT AC C A GCTTGAAAAC TT3GRATAAA AATGATTTTA 54 0 



600 



30 GTTCGGGAGC TCAACAACAA AAACAATTAA GAACACCTGA ACZTCCTAAC TTATCTCGCA 660 



720 
780 
84 0 
9C0 



ACAAAGAAAC CGA3CTACTC AGACAAACAC ATTCAT CAAA AATATCTGGC TGC AC AATG A 

gaggggtaga caaaaacagt gcactacaga cacttaagcc caattttcaa caaaatcaat 

AT AA<" IAMAC A AATGTTi; JGAT GATATTCCAG AAGACAACAC CCTGAAGGAA ACCTCATTGT 
ATCAGTTACA G7TTAAC,GAA AAA3CTAGTT CTTTAA<1AAT TATTTCT3CA GTTATTGAAA 

40 GCAT iAAGTA Tr03CGT3AA CAT3CACAGA AAACTGTACT TCTTrTTGAA GTATTAGCTG 960 

TTCTTGATTC AG:TGTTACA CCTQ3CCCAT ATTATTCGAA GACTTTTCTT ATGAGGGATG 102 D 

GGAAAAATAC T'TIGOCrTGT GTCCTTTATG AAATCGAT :G TGAACTTCCG AGACTGATTA 1080 


GAGGOCGAGT T 3ATAGATGT GTTG3CAACT ATGACCAGAA AAAG AACA j. i TTCCAATGTG 11*0 

TTTCTGTCAG ACCG3CGTCT G TTTC TGA- 3^ 2 AAAAAACTTT CCAGGCATTT GTCAAAATTG 1200 

50 CAGATGTTGA GAT3CA3TAT TAT ATT AATG TGATGAAT3A AACTTAAGTA GT 3 AT AAAAG 1260 

G AAGTTT AGC ATAAATTATA G3AGTTTTCT GTTATTCCTT A \TTT ACC AT CTC CAT AGTT 1320 



(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 872 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

GOSCTACTTC AAAGCCCTGG GCCTTATTTC TTCAGGTAAA AAAATATAAA GTCAGATCTC 6 0 

ATCCCGGC1G GC C ATGCTG T TAGACCCTTT CATCCTTCTC TTCTGCCTCT T C T CAACAGC 120 

15 TGC 7CAGTCC TCTTT03AAT T CAT AT AC AT ACAGTTCTAA TACTGATGTA TTTACCCTCA 180 

TAA3CCACT; AACirCAGAAT CTTATTTGAA IT AT AATC C A GAAACATCAG GTGACGTGTG 240 

AG ACTACTC T AT G AG AAAGA GACAGTTT.AA OSGTCAGTCC AATGGAAAAA AGAGTTCTCA 3 00 


GAGCTTTCTT TAGCTTATTC TCATC AAAGA GCTTTCTCTG CA-.3AA3GAAC CTACTGGTTC 3 60 

CTCCTTTCCA GTCCTAGAAA T2CTGACCTA GAGT3GOTTA ATCCTGCTAG CACCTCTCTC 420 

25 TCGCACTCTG G T 3C C A AA T 3 AC T C C AGS AA CTGG3CCATG ATGTGGTGG 3 AATGACCTTA 480 

CCCTGAGCAT GTCACTCATG C ATT j 7 vAC AA CAGCTAAGAG CAGAGCTTA;; AGCTTAGAGC 54 0 

TGCGO^TGT AAGG T 3 AG A 3 GAATCACATC CTGCAGAAGT CTGTCCTGAG AAGCAGGTAC 600 


TCCTGTCACA GCAGAGACAC AGTGGATACC TGAGTAACAA TAATACAAGA CAGGACGTGG 66 0 

CMACAG3AAA A:3ATrTG3GT GTCAGAAGAR GC CG AG AAC A CTTYCAGGCA GGAACATTCA 72 0 

35 FAETTOTTCT TX.AGGAAFT AGGCMCSAAG OCTCGGCAG3 ATTTCMCG3G GCAGAGATGG 730 

AGCAA3CAAT TGAAATGAAA GCCATGGCAT GGGAAAAGGA GCACTGGC3A CAGGGAGTGC 840 

AACGTTGTGA TGOAAOSCCA CTGTGGAGCC AT 87 2 




(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 812 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

GGCAGAG3CT CACCCCAGCA GAGATTGAGG GGGAACCGTG ATGAAATTTT TAAGTATTCT 60 


'3CTTGATGAT AATAATTTTY CTCTTATGTT AATGTTGGCT C2GTTT3GGT GTTTAGCTTT 120 

TGAAAGGAGT ATGAAAATGC GGAATGGGGC TTT3GGGCTT GAGGAGGTGT GATCTCTAGT 18 0 

60 GTTTAAAAAA TTTAATTGCA CAAATAGAAA TAATTCACCC ACATTATTGA ACCCCACTAA 2 40 
AGCATATCGT TTTTGTCCAT ATTCCTTTCC TGCTGCC2TC GTGTGTAG'OA TTATTACTCA 300 

GTTGTGATTT GAGCTCGTTC CACTTAAAGT CATTCATAGA TACTTTTGOG TCGTGTTKGA 3 o L> 

ATATTTATTG AATTTCTATT CTGTGTTTTA GTTAATTAGT 'ITATTATG3A ACCTTTACAC 420 

AGGTCTG3TG TACTTGTTCT TTGAAAA3TC 'IT AT 1 3 TTG A C CACCATCAIT GAGCATATAG 4 80 

GTTTTTG 3TT ATTTCCTTGG GATAATTACC CGAAGTG jAA ATACCGAATC AAACTTCTGT 540 

TTTCTTTCTT TG3CA OTATT ATATAAATTG TTTTCCAAAC AAG3GATGTT TACMTAGAC 600 

ATTTTTCAAA ATC7G-3GTAT TTCTCCTATT TTGCTCTCTG TATG3AGAAT TGAGGGG3GT 660 

GCCAAGTCGT TTTCTGTGTG 2-GTTGAGAGA C AGGC TG ' TGC AGC3CACT3T T03GATAGGAC 72 0 

TAACTACTAC AAATCATGCT GA3ACCGAGC T70TTTT3CT GCTTAGAR3C TTTGGAGCGT 7 80 

TGAGTAAGTT TCGNGATGTG 1 jA/WC r J ' i " 1 GN AA 812 



(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1515 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:



AATTCGGCAC GAQ3GAAATT CAAGCACTTT TCCTAAAAGA AGGGG3AATG GATCGTGAAA 6 0 

CAACACGTNT CC2ACAAAGG GAG3AGACAC T<3GGCTTGTG AACX7TGCCCC ATACCTTCCC 120 

CACA3AACTG GGGTCCGGCC TCCCTGACAT G*2AGATTTCC AGGCAGAAGA GAGAGAAGGA 18 0 

GCGAGTGGTC AT* 3GAATGGG CI>3GGGTCAA AGACTGGGTG GCTGG3AGCT GAG* 3G AGCG A 24 0 

CCGTTTGAG: CTO3C0AGCC CTCTGGACCC CGAGGTTGGA CCCTACTGTG ACAGACCTAC 3 0 0 

CATGCGGACA GT 2TTCAACC TCOTCTi ^VZT TGCCCTi 3GCC TGGAG2CGTG TTCACACTAC 3 60 

CCTGTGAAAG TCA iYrGCCA AAAAA3CGG.: OTCAAAGACG CTGCT3GAGA AGA3TCAGTT 410 

TTGAGATAAG CCG3T<3CAAG ACCG3GGTTT ■ 3GTGGT 1 3 AGG (3ACCTCAAAG C TG AG AGTGT 48 0 

GGTTCTTX3A 3 CATCGCAGCT ACTGCTCOGC AAAGGCCCGG *3A 3AG ACACT TTGCTGGGGA 54 0 

TGT A CTGOGC TATGTCACTC C ATG 3 AAC A- 3 CCATGGCTAC <3ATGTCACCA AQ3TCTTTGG 600 



WO 98/54963 



PCT/US98/11422



CCGGAACGTC T T A f 1 AC AG T : ^ AGGATGAGAT 
i^X.aAAH^v: CAGCATTTCG ATGGCTTCGT 
5 GAAGCGCGTG ACCGACCAGC TGGGCATGTT 

o ot 3. :T' j at ^ jGttt ( r ag^ : 1 2 tg^t 3 ac : t a 

TAATGCACCC CTGTCCTGGG TTCGAGCCTG 


GCGAAGCAAA ATCCTCCTG3 G30TCAACTT 
T3CCC3T3AG CCTGTTGTC3 GGGC 2 AG* 2TA 
15 GAT3GTGTGG GA3AGC 3A33 YCTCAGA3CA 
GAGG2AC3TC GTCTr3TAC3 CAACCCTGAA 
G3AG2TG3G3 GTTGGGCTCT CTAT2TG3GA 


CCT3CTCTAG GTGGG3ATTG CG3CCT 2CGC 
TGAGTGAGCA GGTGTGAAAT ACAG3CCTTO 
25 AAAAAAAAAA AAAAA 



300 



AG AG 1 3AGCTO AGCAACACCG TOGTCCAGGT 34 0 

GOT' j jAGGTC TOG AACCAGC V72 AGC 1 2 A 900 

CACG2ACAAG GAGTTT<3AG2 AGCTGGCCGC 960 

CGAGTACTCT AC\GCG3ArC AC-CCTGGCCC 102 0 

CGTC2AG3TC CTGGACCCGA AGTCCAAGTG 1080 

CTATGGTATG GACTAC 3CGA CCTCCAAG3A 114 0 

CATCCAGACA CT0AAG3ACC ACAGGCCCCG 120'- 

3TTCTTCGAG TA2AAGAAGA GCCC-OAGTG3 126:' 

GTCCCTGCAG GTGCGrrGTGG AC;CTGG2CCG 132 0 

GCTQ3GCCAG G02CT'3GA2T ACTTCTACGA 13 So 

GGTGGACGTG TT'GTTTTCTA AGCCAT'GGAG 144 0 

ACTCCGTTAA AAAAAAAAAA AAAAAAAAAA 1500 

1515 



(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 704 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

40 AAGATGGTGG CGCCCAGAGC TTCGCTCTAT GCTGCTCCCG TGAGAGAGGC GTTTCCATCA 6 0 

ACC AGTTTTG CAAGGAGTTC AATGAGAGGA CAAAGGACAT CAAGGAAG3C ATTCCTCTGC 120 

CTACCAAGAT TTTAGTGAAG CCT3ACAGGA CATTTGAAAT TAAGATTGGA CAG2CCACTG 180 


TTTCCTACTT CCTGAAGGCA GCAGCTQ33A TTGAAAAG2G G3CCCGGCAA ACAG3GAAAG 240 

AGGTG3CAGG CCTGGTGACC TTGAAGCATG TGTATGAGAT TGITCCGCATC AAAG2TCAGG 300 

50 ATGAG3CATT TGCCCTGCAG GATGTACCC 2 TGTCGTCTGT TGTCCGCTCC ATCATCGGGT 3 60 

CTOCCCGTTC T C T 1 3GG3ATT CG2GT3GTGA AGGACCTCAG TTCAGAAGAG CTTG2A3CTT 42 0 

TCCAGAAGGA ACGAGCCATC TTCCTGG2TG CTCAGAAG3A G3CA3ATTTG GCTGCOGAAG 480 


AAGAAGCTGC CAAGAAGTGA CCCTTGC 2CC ACCAACTCCC A 3 ATTTCAAA GGA3GTAGTT 540 

GCAAAAGCTG TGCCCAAG3G G AGGAAG 3 AG GTCACACCAA TATGATGATG GTTTTCATGA 600 

60 CTTTGAATGA TATATTTTT 3 TAG ATCT AGC TGTATCGAGG CATCAGGCCT GAATAAACAT 560 
CCTTTCTTAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAA 7 04 



(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1094 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

GGrAGCHTC TTACAAACCC ATCCTTCTGA AATGTTGCTT CAAATTCATC CTCTGCTCCC 60 

CAGTCCCACT A r rTCCACACA TACTGTTACT GTTTCTTTAT CCTACTTTCT CV.TTTTXA 12 0 


ACATAGTTGC AOTTACTGCA TTGAATACCT GTGGGTTT<3C CTGTTGTTCT GTCTCTCTCT 130 

GT3GTTCTTG TAATANTGGA TCCCAGAGAT AAAATGGACA GTTGTNATOC ACAGTTAATT 24 0 

25 CAGAAACTAG ACCTTACTTG CTGTGTGAAA TACCAACTAA ATTCTCAGTG AACTCAGCTG 3 0 0 

ANCTTTATCT CCTTTTGTTT CCCCAATTTA TAATTT2A2-T TCAGGCCCAG AAAGATGGAA 3 60 

TCCCAGCTAA GAAATA2AAG TTACACCCTG TACTAGCAGC CCATGTGTGC ATGTTCTTTA 420 


AGTOCTCTTS CAGCTATGTC ATTTATATTG ATTTCCCTGT ATT ATT AT AA GCAAAGCAAA 4 80 

T'TT OA 03 AAA AAAA^CCATA ATACCA'SAC' - " TCATTTTTTT CAAGTAATAG GGTCATAA.GT r 54 0 

35 CTCATYCTYC ATATAATATG TTG AG TAT 1 ~Jr 2 AGTATATTAT GTGTTAGGCT CTGGANAGGC 600 

AGA3GTTAGA TCATGTWACA GAT QATAT 2 K GATTAGGCAG ATAAACAGTA TTTTAACCTT 66 0 

TTCCTTATTA TATGTAACTT GCTTTCAGGT TTTTTAATGT TACT ATT ATO TCTTTAATAT 72'! 


ATTATCTTTA TTTGTACTTT TGTATACAGA GTGATTTTCC TTTTTTAAAA AAAATTGTGT 7 8 > 

CTTTAOOATG OATTCCAAAG ATGTGGAATC AGTAGGTTTA AGGAATAT' v;: AT ATT'T rGCC P. 4 .) 

45 TGGCAAGGTG CCTCACACCT GTAATCCCAG CACTTTOGGA lOCCIGAOGIG OGTGGATCAC 90 

CTGAAGTCAG GAGTTCGAGA CCAGCCTGAC CAACATGGCG AAACCCTGTT TNTACTAAAG 96 ) 

ACACA-rWWAA AATTRGCCAG TGGT3GTGGC ATGTGCTT3T AGTCCCACTT AGCTACTCGA 102 0 

GAGGCTGAG3 CA3GAGAATC GOTTGAACCC GGGAGGCAGA GGTTGCAGTG AGGCAAGATG 108 0 






GOACCTCTAC AC 



1094 






(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1821 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

10 CGTGTGG7GG 7G7GGAGA7G 7GTOTAGTGG TCTTA(3CTCT TTGTTGAGCT TG'TGTGTGTG 110 
TTGTGTAGTG ETA 377^ G7 A7 3-3*"?' GAAAFTG GGCGTiSTGTT G3AG03GTTG TTAGCTCTTT 180 

irTcrrr-iAAr-G gtceg'.vigg ggggaattaa gtac-qggaaa ag^tgtttcia gggttggtag 3 go 






G GIG 3777*77-7 AGAACAGGAA AATG3GTGAA AGCCTTC-CTT CO " AG3AAGC TG03GCTGGT 3 GO 

20 TCGCAGTGGG 7GG7GGTGC3 3'3TTCGT3GT TCTTAT'TTCA AG3GAGAGGT TGT'GA^TTTG 42 0 

G 3 3TGLAA.3 3 T GA7G3T~3TAG 7T-3G3GGATG TGCTTG3GCT GTTCCA3CAA GTG AG AGAA 3 54 0 


GTA.C77TAGTC 7TG7AC3*7GG T'GTTC AGG G A G3TGCATTAA CAGACCTCCC TAGAGGTGTA 60 0 

GGAACTACTG 7773AGAG3T 3AGG7AAGGG aATTTGTCAG GTrATTTGGA GAACAAGTG3 60 0 

30 T7TA377A37A 37777 AAAGTA GTAAGTGGTA 1 3TGTATTT AG TGG3GTG3AA TTCAGAAGAA 7 2 0 

ATTGGAAGAC G--GA7GATC-G 3GG3T37-GGA TGTGAATGAA CAGGAATGAG CGGGACAG7C 7R0 

TGX3 377GTCA.T 7GC7TT3TTG 7T73C7GATTT 'GGACGGTTGT GTG3 C G 7T AG ATTTTT^ 7TTT 34 0 


CTC ! GA7X77 AG GA7GA7GGAG GAGTG7A7TT ATT A\ G TT AG GAAGAGGAr_A AGTAAAGGG3 900 

40 GTA7GCTA7T 7XXLAAATC G C TAACAGAATT GAGTTTTCTA TTAAGGATCC AAAAAGAAAA 1020 

AGA.--AATG-GT AATGAAGGGA TCAGTGAAGG GTCAGATGGG AATAAACAAT AAATTTTCGA 103 0 

GAAjGAAAGGA AA7GGAACTA GACAAATAAA GTAGAGGTTA TGAAATGGTT GAGTAA-3GAT 114 0 


GAG7TTTGTTG TTG^TTTGTTT TGTTTTGTTT TGKTTTTTTA AAGAGGGAGT GTGGGTCTGT 1200 

CAC72CAGGGT GGAGTGGAGT G-GTATi2ATCT TOGCTGACTG TAAGGTGCG3 <7TCCGQ3GTT 1260 

50 C AA-'GCC ATTG TGGT'3CGTGA GTCTCGTGAG TAGCTGGGAT TAC A' 3GTGGG TG2GAG3ATG 132 0 

G CTGGCT AAT TT77TGTGTTT TTAGTAGAGA CAGGGTTTGA CGATGTT03T 2GG3GT-3GTG 138 0 

TGAAACTGGT GAGGTCTTGA TCCGCGTGCC TTGGGCTG GC AAAGTGAT3G 'GATT AG AGAT 1440 


GTGAJGCC.-. G G GGTGGCCTAG G GAAG3ATG A GATTTTTAAA GT AT* 3TTTGA GTTCTGTGTC 1500 

ATGGTTGGAA GAC.-JG.AGT AG GAAGGATATG 'GAAAAGGT'IA TGGG3AAGGA GAGGTGATTC 1560 

60 ATGGGTC7GT GAG.7TTGAGG TGAATGGTTG CTTATTGTCT AGGCGACTTG TGAAGAATAT 162 0 
GAGTCAGTTA TTGCCAGCCT T : OG AATTT A. C 7TCTCTAGCT TACAATC-GAC 
OGAAAACACC TTGTCTOOAT TCACTTTAAA ATGTCAAAAC T AATTT 7 T AT 
T ATT TT C AC A TTGAAAAAAA AAAAAAATTT AAAAACYC GG GGGGGGCCCS 
NGCCCCTAAG GGGGGG03TT T 



AATOTT 

CCAATT 



1630 
1 740 
1800 
1321 






(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:

i A) MHOGTH: 4 base pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

go;ggcacagt tgaagaa:.acg accgack^gac tijggagtcgt tagtgaggat gacgcggcat 

GGCAAGAACT GCACCCJOAGG GCCGTCTACA CCTACCACGA GAAGAAGAAO GACACACtCGG 

cctcgggcta tc^;acccag aacattcgac tcagccosga tgccgtgaag gacttcgact 
gctgttgtct ctccctc-oag ccttgccacg atcctgttgt caccccagat ggctacctgt 
atgag0gtga q^ocatcct ; gagta< oytcc tgcaccagaa 1 0 aa 1 0g ac iatt gcccggcaga 
t jaac ago: t a c g agaa > z ag c g 3 a a a c c c ggcgcgasga g-c a gaagg ag c tt cac ~*c ggg 

C 3GCCTCGCA GGAOCATGT 5 CGGA'A 0TTCC '0 A 0 AG AA A j A GTC AGCTATC CTGAGCCGGC 

coctcaaccc tttcacag: : aaggcgctct c-gggcaccag co:cagatgat gtccaaoctg 

'UGCCCAGTGT G3GTCCTCCA AGTAAA3ACA AGGACAAAGT GOT ACCCA<AC TTCTGGATCC 
CGTCvOCTGAC GCCCGAAGCG AAGGCCACCA AACTGGAGAA GCO^TCCCAC ACGGTGACCT 
CCCCCATGTC AGGGAAAO"Cr OTOCCrOATGT C'AGACCTGAC GC rCGTGC AC TT rAGACGGC 



1 J A CO' GC-A CAGCCT j~-v^r_ 
TGGTCACCCT CGAATGCGTG 



\GAAGCTGA TTCGGAAGGA CATGGTCGAC CCTGTGACTO 



60 
120 
180 
240 
300 
360 
4 20 
4 80 
54 
600 



GAGACAAACT CACAGAOCGC GACATCATCG TOCT<:AOAOCG 3GGo;;GTACC G^TTCGC 3GG 



84 0 
90 0 



M) 
(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 983 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

CGAOAC03CT OOGAGAVJAC GACAGAAGGG CCCGACCGCG AGCCGTCGAG GTCTCAGTGC 60 

TGTGCCCC :C CCAGAGCCTA GAG3ATGTTT <3ATGQ3ATCC CAGCCA03CC GGGCATAOIA 120 

15 

OSCCCTG^GA AGAAGCC iGA GOTGTATGAG GAAGTGAAGT TGTA TAAGAA CGCCCG3GAG ISO 

AGGGAGAA3T A "GACAAGAT G3CAGAG:TG TTTOC ; 3GTGG T3AAGACAAT G7AAG"C'rG 2 40 

20 GAGAAG03CT AOATCAA3GA CTGTGTCTCC CCCA3CGAGT ACACTGCAGC 'TTGCTCCO^G 3 00 

■ :tcgt > j r oo aatacaaag: tgzcttca > ga< 3gto 2agg got zagaaat ■ :ag rrcTATT 3 6 c 

GACGAATTCT GC03CAAOTT CCGCCTG3AC TGCCO3CTQ0 CCAT3GAGCG GATCAA3GAG 42 0 

25 

GACCG3C3CA TCACCATOAA G3ACGACAAG C^GCAACCTCA ACCGCTG3AT CGCAGACGTG 480 

gtstoctct tcatcag-'jgt catg3acaa3 gt3CGgct<3G agatgc^:gc :AT03ATGAG 54 0 

30 atcca3c 3cg ac3tgcgaga g3tgatg3a 3 accatgcacc o2atgag0ca cctcccac7c 600 

gactttgagg gozgccagac g3tcagccag t302t3caga ccctgag2gg catgtcg3cg 660 

tcagatga3c tg3acga"""tc acaggtgc3t cagat 3ctgt tcgacct3ga 3tca3cctac 720 

35 

AAG3CCTTCA AC3 3CTT:CT GCATGCCT3A G3CCGG3GCA CTA3CCC2TG 2AC A 3 AAG 3G 780 

CAGAGTCTGA G3CGATG3CT CCTGGTCCCC TGTCCGCCAC AGA3GCCGTG GTCATCCA7A 840 

40 CAACTCACTG TOT 3CAGCTG CCTGTCTGGT GTCTGTCTTT GGTGTCAGAA CTTTTGGGC3 900 

GGGCCC 2TCC CCACAATAAA GATGCTCTCC GACC TTCAAA AAAAAAAAAA AAAAAAAAGR 960 

KGSGGCCGGT CCCCANTCCC CCC 983 




(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2421 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:
CCGGCTGATC GCTGCCGCTC CGCCAATACA ATAGAGCCAK CCACTACCAG CAGCCTGGCC 60 
ctctt3otcc ttctccagag agaccaatcc agccgaactc ggggtttgcc tgaggagaag 120 

gaggaa3tga c c atg 3 ac ac aa< 3 tg aaaac ag a 2 ■ 0 tgaa a atgatgttcc agaa0ct3 3 3 130 

atgcctattg cagaccaagt cagcaatgat gacc3c3 3g3 agg33agt3t tgaagat3a 3 2 40 

GAGAAGAAAG A3AGCTCG3T GCCCAAATCA TTCAAGA 3-3 A A3ATCTCC 3T T3TCTCA 3 2T 3 00 

ACCAAGGGGG T 3CCAG3T3G AAACAGTG AC ACAGAGG3G3 GCCAG03T3G T 2GGAAACGA 3 60 

CGCTG3GGAG CCAGCAOAGC CAC2ACACAG AAGAAACCTT CC ATCAGT AT CACCACTGAA 4: 0 

TCACTAAAGA G3 TTCATCOC CGAOATCAAA CCC3TG32G3 G0CAGGAG3C TGTTGTGGAT 4Rn 

15 CTTCATGCTG A TG ACTCTCG CATCTCTGAG GATGAGACAG AXGT.W':>:, C 3ATGATG3G c .4'» 

ACCC ATGAC A A 3- >G3CTG AA AAT AT' 3CCGG AC A 3TC ACT C AGC IT AGT A 3 3 T 3C ACAG3Gv : 0 0" 

CAGGA3AATG GOCAGAGGGA AC iAAG AG 3 AA 3AAGAGAACG AACCTOAA5>3 AGAACCT23T fb" 

GTACCTCCCC AC-3TGTCA3T AGA3GTG3CC TTGCCCCCAC CTOCAGAGOA T3AA3TAAAG 12 1 

AAA3TGACTT TA3GAGATAC CTTAACTCGA CGTTCCATTA GCCAGCAGAA GTCC3GAGTT ~8" 

25 TCCATTACCA TT3ATGACCC AGTCC3AACT I3CCCAG3TGC CCTCCCCACC CCGGCrGCAAG R4 J 

ATT AG? AAC A TTGTCCATAT CTCC3VATTTG GTCCGTCCTT TCA 3TTTA 3*3 CCAOOTAAAG c <0>" 

G AG TT3TT' 3G GGCGCACAGG AACCTTGGTG '3AAGAGGCCT TCTGGATTGA 2 AAG AT 3AAA ^60 


TCTCATT02T TTGTAACGTA CTCAACAGTA GAG3AAGCT3 TTGCCACCCG 2ACAGCTCTG 102- > 

CACGGGGTCA AATG3CCCCA GTCCAATCCC AAATTCCTTT GTGCTGACTA TGCCGA3CAA 1330 

35 GATGAGCT'3G A1TATCAO03 A3G2CTCTTG GTC/IACCGTG CCTCTGAAAC TAAGACAGAG 114-"- 

GAGCA-3GGAA TACCAC33CC CCTG3A2CCC CCACCCl^rAC O. X'CGGTCCA 3CCACCACAG 120' ) 

CACCC 3CGGG CAGAGCA303 G3AG2A3GAA CGGGCAGTGC G^v3AACA0TG 0O2AGAACG 3 12 6') 


GAACG3GAAA TGGA03G33G G3AG2G3ACT CGATCAGAGC GT3AATGGGA TCGGGACAAA 132 3 

GTT2GAGAAG GGCC'333TTC CCGATCAAGG TCCCGTRA0 3 OXGCCGCAA G3AA0CTGC3 118.' 

45 AAGTOTAAAG AA\A;AAGAG TGAGAA 3 AAA GAGAAAGCOC A3TAGGAA3C A3CT3CCAA3 144 ! 

^TOOTOGATG AC^TTTTCGG AAAGAC 0AAG GCAG3TCC2 V C-3ATCTATTG G2TCCCACT3 150: 

ACTGACAGCC AGATCGTTCA GAAAGA3GCA GA3CGGGC03 AACGGGCCAA G3AG3G3GAG 156':' 

AAG3G3CGAA AO 3 A 3C AAG A AGAAGAAGAG CAAAA(3GAGC Q3GAGAAGGA A3CCGA3<3G3 162 0 

GAACGGAACC GA3AG2T(3GA GCG AG A :3AAA CGT2G33AC-C A3AGTCGGGA 3v\G3GACAG3 1530 
A'.'XJTACCACX: CACTCG3CCC CA3GGGGTTA TGGCCACAGA G3GATAGGCA. CAGTCTCCAC 132 2 

CACCCTQjAG CCAAG3GTCT TTCACATCAC CTATCCCTAC ATACATACCA .\ATGGAAAAG :3 30 

5 T<3GCCATCCT TTTCCCCCCA AACACACCCC CTTAA2CTAT CTCTTG3GAC TTAGCCCGA2 2 340 

CCTCCCTCTC ATTTCCCATT AAGTCTGAGA (3C-CAAGAGCT AGGTTAG3CA AGGAC-GTGGT 2 2 00 

T< 3GC ( 2A' 2 AG A T(3G3GAACAG CCAGGTGCCC <2AGTCCTCTG ATTTTTCCTC CATCCTGCTT 2 250 


AC2AGCT2CC TGOGTACTTA CA3CCTT2TC TTGGGAACAG CCGGGGCCAG GACTGGGTCA 22 2 0 

CCTATGAG2T GAAT3AGCAT CTCCTCCTGA GTCCCAGGGC rCCTGCAGTT CCCAGTCTCT 22 30 

1 5 TCTGTCCTGC AGCCCTTGCC TCTTTCCCAC AGGTTCCACT TTATATCCAC CTTTTCCTTT 2 2 4 0 

TGTTCAATTT TT ATTTTT AT TTTTTTTATT ATTAAATGAT 3TGGTCTATG ■ 3AAA AAAAAA 2 400 

TAAAAATCTG ACTTAGTTTT A 2421 




(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 840 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

CTCAAACTCC TGAGCTGAAG CGATCTACCT C-CCTCAGCTA GGATTACAGG TOTGAGCCA2 5 0 


03CACCCAAC CTCAATAAGC KTATTTGATA AAAKATATGC AAGCTCCCTT TATrCACTTT 22 0 

TGATTCAGAA TGTTTAGTAA TTT 3TATTGT TTTTCAGATT TTCAGCCCAA TAT AT CTC CY 230 

40 TGCCCACTGT GTCACTGTAT TCTACCTAWA CATCATCAC G TGTTTCTGCT ATT< ~* 3 1 2TGT A 240 

T3ATGGAACA CT3CGGCTCA TTTTCCTGAA A^CTGCCGAT AGTGCATAGA RTGCTGGGA7 3 00 

GGAAACCAGA ARCTTTGAAT TCAA2CCTTG GTTCTGCCTT GTTTTTGCTT GGGTGOtCCTT 3 50 


GAGTCAGCCA CATACCTTTT AAAATCTCAA TTTATTAGAA ATTATTC2AA ATC AAAATCA 42 3 

AATGAGAAGG TATATACAAA AGTjCTTTAT CCCACAATAA ACTATTCAAG AGA2AG2AA-A 430 

50 GGAGAGGACA TTTACTCAAC ACCTGCTAAA A3GCAGCCAG TGAAATTAC-G CATTTTATTT 54 0 

AATCCTCCTG GCAACTCTGA GAGTAAAGCA TTATTAATCC (2ATTTTGG2T GTTTAAAGAA 500 

ATTATTTGCA CTA<3ATTCCA G2TGTAGTTT AGYTTCAGAA AAAAAAATC C TGAGATGTGA 560 


ATTCACAGCT TTCTGGGTTT AAA' G2 C C AAG CTCTATCACA TCATGCTATT ATTGTTACAT 720 

TACTGCTAGT TCTATGAAAA GAAATACTAA TTTATGAAAT ACATCTTATC CA£J\AAAAAA 7S0 

60 AAAAAAAAAC TGGGAGGOGG GGCCC 3TACC CAAATCGCCG GATAGTGATC GTAAACAATC 340 
(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2432 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

15 o3ca<:gag:;o c go •< iaac c xn ' gag^aagggc ccgtogcgcc ttccccggcg cgccatggag oo 
ccccgggcgg ttgoagaagc gotogagacg (XT'Ga-:x-a(}<j atgtgattat ggaagctctg 120 

CJGTGATAOA AC 2AGGAGCA GT : 2 2AGAGC TTCACGTTTG ATGATQ2GGA ACACJGAGGAC ISO 

20 

2G2A\GAGAC TGG2G2ACTG CT2GTCTCCG TCCTO2AA0A GGGCTT2CGA CCC2OGCV02 240 

GTGTCAT2TG G2TGCAGAGT GTCC3AAT2C TCTCC 70Q;;A CCG2AACTG2 CTGGACCCGT 3 )( 

25 TCAG2AG2CG C 2 AG AG 22 TG CA2G2AYTAG CCTGYTATGY TGACATCTCT GTCTCTGAGG 3^0 

3GTCCGTCCC AGAGTC2GCA GAGATGGATG TTGTACTG2A GTCGCT 2AAG TO 2 2TGT3C A 4 2 ■: i 

ArrT^GTX T CAO~A3--v-CT 3T2GCA2AGA TG;T03CA2C A3AG2C2CGG CTAGTGGTGA 4 30 


AG2TCACAGA GCGTCT.a~-2G CTGTACCGT2, AG AG2J AG2TT CC2CCA2GAT GTCCAGTTCT 54-- 

TT22 j ITTO" 2 0 C T 2 GT "IT G 2 T JCTAAC O 2 CA 2TC 2 G 2 AG G O'YTOTGCOC C 22 1 AC C TCTT 6 " ) - • 

35 TCAGGAGCTG AAAGGAGTGC 3CCTGOTAA2 TG ACA2A 2TG GAGCTGACG2 TG3GGGTGAC 6 6'") 

T2CTGAAG2G AAC2CX22CAC CCACQ2T2CT TC2TC202AA GA2ACTGAG2 GGGCCATGGA 720 

CATC 2TC AAA GTOCTC rTGA AGATCACCCT G3ACTCCAT2 A A GG3GGAG3 TG3 A 2 GAGGA 7 8G) 


AGACGCTGC2 CTTTAC 2GAG ACETGGGGAC C2TTCTCCG3 OACTGTGTGA T3ATCGCTAC 842 

TGCTG3AGA3 OG2ACAGAG3 AGTTCCACCG C7ACGCAC-TA A:02CTCCTGG G3AAC1TGCC :)'•" 
45 CCTCAAGTCT GGGAT * 2' G227V: T U GG ;g.- GAT - AGA^GGA G.2GAGTTGAT 

c jGgagtgaa. r atggatgiga ticgtgccot c2"lcat;t2c l 4 AuA..aa^ gtttgcacaa 102"; 

GAGAOAGAGG CTGAAGGAGA GTGTAGCTCC CGTGCTGAGC G2 G2TGAG2G AATGTGCCCG 1280 
-ATGGAGGGG CCAGCCAOGA AGTTCCTGAA C- 2A\ - ?T .1 r"GCCCCCTC 2GCGG2ATGT 11 
ATG ACT AC AA GGAAGOCAAA GCCAGCATAA ACCCTGTGAC CGG3AG3GTG GAGGAGAAG7 144 0

CG2CTAACCC TAT GGAOGGC ATGACAGAGG A3CA3AAGGA GC A CG AGGC Z ATGAAGCTGG 15C : 0 

TGACCATGTT TGACAAG3TO TCCAGGAACA GAGTCATCCA GCCAATGGG 3 ATGAGTCCCC 1560 

G33GTCATCT TACGTC 2CTG CA3GATG2CA T3TG2GAGAC TAT 33 AGC AG CAGCTCTCCT 1620 

CG3ACCCTGA CTC3GA3:CT GACTGAGGAT OGCAGCTCTT CTGCTC 2CCC ATCAG3ACTG 1680 

gt3ctg:ttc cagagacttc cttggggttg caacctggg3 aagcca ovrc CCACTGGATC 17 4 0 

CACACCCG2C CCCACTTCTC C ATCTT AG AA ACCCCTTCTC TTGACT7CC 3 TTCTGTTCAT 180 0 

15 GATTTOCCTC TG3TC 3AGTT TCTCATCTCT GGACTGCAAC GGTCTTCTTG TG3TAGAACT I860 

caog3t:ao: ctcgaattc: azagacgaag tactttcttt TGrcT033C2 aaoaggaatg 1920 

TGTT 3A ! 3AAG CTGCT03CTG AG3GCAG33C CTACCTGGGC ACACAGAAGA GCATATGGGA 1980 

20 

GGGCAG3GGT TTOGGTGTG3 GTGCACACAA AGCAAGCACC AT 3TGG3ATT G3CACACTGG 2 04 0 

CAGA32MANT GTK.TTGGG3T ATGTGCT33A CTTCCCAGQ3 A' 3 AAAA C C TO T3AGAACTTT 2100 

25 C CAT AC 3AGT ATATC AG AAC A2ACCCTTCC AAGGTATGTA TG:TCTGTTG TTCCTGTCCT 21*50 

GTCTTCACTG AGCGCAGG3C TG3AG3C:'3TC TTAGACATTC TCCTTG3TCC f rCGTTCAG3T 2 22 0 

GCCCA3TGTA 3TATCCACAG TGGCCGA 3TT CTG3CTG3TT TTG3CAATTA AAC CTCCTTC 2 2 80 

30 

CTACTGGTTT AGACTAC AC T TACMCMG3 AAAATGC C C C TCGTGIGACC ATAGATTGAG 2 340 

ATTTATACCA C AT AC X AC AC A T AGC C AC AG AAAC AT CAT C TTGAAATAAA GAAGAGTTTT 2 4 00 

35 GGACAAAAAA AAAAAAAAAA AAAAAAAAAA AA 24 3 2 



(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1742 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:



50 GTCCTGCAGG AGCTGCACGC GOCCGAG3TG CGCANGAACA A 3G AGC AGOG AGAAGAGATG 6 0 

TCGGGCTAAG GGCCC03SAC GFGSG33G3C OATCCTGCGA COGAACACGT TCC03TTTTG 12 0 

GTTTTGTTTC GTTCACCTCT GTCTAGATG2 AACTTTTGTT CCTCCTCCCC CACCCCAGCC 180 

55 

CCCAGCTTCA TGCTTCTCTT CCGCACTCAG CCGCCCTGCC CTGTCCTCGT GGTGAGTC<3C 24 0 

TGACCACGG2 TT3CCCTGCA GGA<3CCGCCG ' 3GC GTG RAG A CG3GGTC03T COGTOCAGAC 300 

60 ACCAGGCCG3 GOGC3GCTGG GTCCCCCGGG GGCCCTGTGA 1 3 A3 AG3TOGY GGTGACCGTG 3 60 



GTAAACCCAG 'GGGGGTGGGG TGGGATGRGG GGTCCTTACG 

gtogaggtoa ggggaggtgg tgtgaggggg gggcgctggc 
ggtggt gtgt ttg ga g g g g g aaggg< g r r g g ■ gg at g a< g 3 ra 
tgatg gtggg tgggaagggt gg k tg t 1 g g an ggggtggttg 
gtggggggtg cagctttcct taatgtggtt ggagaggggt 
gaggt-ggaga oggtgggggt tggtggaagg gtggagttgg 
g g g j A* gt r g g ■ g att rr gt g : < : agg ^a- re 1 g a t gag : ag g gt 

' GGA 1 G GAT G G A 1 : A' GT J JG'GCA CT G AG AA< GT G > GAA' GAGTTGG 
GGGTG GGTGG GA:AG:g:CG CC-GrCCCTCC GGAGGAGGGG 

ACC' :a:> g gt a < g» a g : : : ptt< j g a gtgct g g g czz r- g gt gt g 



CTGGGCTGTC 
GAGGAGGGGA 
<GOGGGGAG ; 
TTGAAGTGGG 
G GTGTRAGAG 
GGGGGTGGGG G 

ggatggtota 
aaagagcgtc 
ttt-ggagatg 



tggtcaggac 
ggc tacagta 

GG GGGAGGGGG3 
AGGGGGGTGG 

gaggtggggt 
tgagt gtggt 
gaggttgtgg 
tgggggaggg 

rGCTTGAAAG 



GGGA^ GAGAA 

atcgaggtgt 
ttgggaggga 
tgt ggggt g.\ 
tggaggatgg 

GGAAGG GG GA 
GAGTTGTGTT 
CGGOGAGTGGG 
GAGGGGrGAG 
G'GAGAAGGAG 
A-- rr G AG . A*\AG 
G GAAAGG IT A 
AATGAAA GGA 
GG 



TGGAAGGGGA GGGGGTAAG3 
G AG GG GGGTG TGGGTGGA'GG 
GTTGGTTGTT TGGGTTGGAG 
GGGTGGGGGA gggtgtgttt 
T2CGAAAGGT GGT3GGTGGG 

AGT GT GTTGG AG A 1 GG CAG AG 
GGGTG A GGAG GGAG GGAGTG 
C A GG G A GGTG GG A TGO GAAG 
GGAAGG AGAA GTGAGOGGTG 
GT GTAA TATG ■ GAG G G ATGT G 
AGAATTAAAT AAAGAG GTGT 



AGGGGGATTr 

CGTGACTCCA 
GG GAGTGGT 
TACCACTGC 
GGTPGGTGT 
GAA GTTG GA 
GAGGGAGGA 
GG'G.AT-GT GG 
GG AAGGGTGT 
GGATATTGGT 
ATGGAGGTAG 
oTGTPTAAGA 



TXTGGGAGG G 
A 1 G GGG G AAG A 
GGG'GTGTGCT 
GGGGTTGGAG 
AGGATGGGGG 
AAAGTGTGAA 
TGGTGGGGGA 
GGGGAGGATG 
GTGTCGGGAG 
GCACAT GTGG 
GATGGTGGGA 
TAAAGTGAAG 
AAAAAAAAAA 



GTGGG GAG AG 
GGGTTGGGAG 
GGG'GAGAGGG 
■GAGGGTGAGG 
AGTTGTGGAG 

GGGGTTGGGG 
GTGAGGGGTG 
GAA GAG AGG G 
CTTTTGTGGG 
G G ■ GGT G AT A G 
T? AAAAA TTG 
AAAAAAAG TG 



42 0 
4 80 
54 0 
GOO 
600 
7 2 0 
7 8 0 
84 0 
00 0 
0 0 1 1 

102 

108" 
114 0 
1200 
1200 
132 0 
1380 
144 0 
l^O-i 
156 ') 
162-') 
108 ) 
174 0 
1742 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

GiCACGAGCC TCO:-GAA':T GTGIAGTCOG ■VGGAGGGCTG GAATCAGCGT GGGCTCCAG j 6n 

5 TCGCTGGOAG CC03GTGG:A GAACTCTTCC GAG3CTCCTT 1 jGG AAGAAGC TACACCCGAG 12'.' 

g;a<:x:cggat g3G"tcgaa aaoctggccc '2ct2tggttc tgtaccattg caag^g3aac 13- 

CGTAAACTGA 07TTTTCTAA CGTGGGTTTC TQ2 2AAGTAC TTTTCCAGCT GCCCCCTTCC 24 0 

10 

CCGCAGCAOA CAGGAGAG2C TCTGTGTAG2 ZAG2G2TTGA '2AGTCGTTAG GTAG3TTGTA 3 0 f - 

CTGTGTAGGG AG3A3CTCAA GAT'OATGAAT GGTTGTCACA GiAGAAAGCG GTTG2ATCTT 36-) 

15 tggaaaaota TATACCTGrT GT03TTTGTG TTTTCTTTTC TGCTGAGTAA TGAAGTTGTA 4 2 1 

agttcaoxot ogaacattot caosgctgtg iagattattt gaactttatt TCATAG3TGR 48'- 

ataagtgott t' r r a g i ttt c tttgtatatt ga^ttgcttt tgaattgctt cccatatttt 54'? 

20 

TATTTCATAO AAACTGAACA ATTGTGGCCC CTCTATTTTA ' PTT A T AAAGG TTCA 3TGTAT 6CK 

CTTTGrCT vO CTAOATCAAT CTGCAAGG3A GTT3CAGAAA G2CTCATGTT CATCGAGCCG 660 

25 TGAGTOACAA CCAATTTCTA AGCTGTTATA ACAAAAAAGT GTTTG2TTTT TTTCACAAGT 72'"- 

AA.CTTT AAA A GTGTAGTTTA GAAAGAAAAC ATTTTCAATA AAAAGACACT ACATTAATCC 730 

TG3ATO0TTG C AAAT 2CTAA AATMTATT2C TCCTCTAGCG TTGCACAGCT CTGTGTTGTA 840 

30 

TACACAGA2T A'GCTTTAAAA TTTGTCA' Z AT ACCACTTTAC CTTTACTTTT ATGTATCATT 90'"' 

CCCCCGACTT CCTTACTOGA G3TGTGGGCA AGAAAACTTT TCCTTTAACA CTTTTGAACA 96 0 

35 G03GGCATAA AATTCTG2AG CTGAG3TGTT 3AA3AATGCA GATG3GTACA GTATGTGTTG 102'.' 

GAGCTCACAG TGTGTATTGA CTAACCTAGT TCCTTTTTTG CTTTTTTTGG TATTGTCTTG 108'.' 

TTAAAAGTGA CTCCCAGGTA GOAACTCTCT TTTTTAAGGG TGGGAACGAA AGGGAOGTAG 114 U 

40 

GAAGAATAGA TCTAGATTAT TTAACAGTCT TCGATAGAGT TTGAAAGCTT TCTTC TTC AT 1200 

TCAATTTTG G GCAAAATACT GCCTCTGCAT TTGTTCATAA CAAAAAGATT AGATTAATAA 1260 

45 GTAGCTTTTG TTGGTGGAAA TTACCAGCTC TATAAGTCAC CCTTG3TGGT TCATGGACCT 132 0 

CTGATTAGCT TGGGTTTTGC AGTCTCATTG C C ACATGT AT ATGTGGAGCC AATGG2CTTT 1380 

TGGTGCTCAG CTGTTTACGT CTGA3TCCTT 'GACTTCTTTG GTACAGTGAT GGAGTGAGAT 144 0 

50 

CTCATTAAGT GTGATTCTCC ATGGATATAA CC AGC C C CAA AAAAANG 1487 



55 

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1328 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

GGCAGGAGGT GGT l G3G ( 3AAT T0 l 3GGAG3AG AGAAGATTTG AAGAAGCCAG ATGCA'GGTTG 6 3 

ggt3cgog2t g.ctt 3ttgtg -333aag33aa aaagaggaag gcctgtaaga actgcacctg 120 

tg3ccttgcc gaa3aagt3g aaaaa3agaa gtcaag03aa ca< 3at3 agct cgcaacccaa 180 

gtcagcttgt go^aaactgct agctgo3gga tgg:ttgggg tgtgccagot gcgcctaggt 24 0 

tog3at32ga g3cttcaaac gt3gggaaaa g3tggttgtg agt3atag3a atottgatga 3 00 

t<g:2tag3AG gttggtgaga tgggagcgat ctgotggtcg agggaagtgg tgtgogtgag 300 

atgccagcat ggtg3ctcct cccacctcgt ctg3atttgt tga3tgtgag atgt3tttg2 420 

'g3tgtgtatg aaaagaggaa ' 3 3tatt at' 3g <3a2gt3gttt cagaatogga tgggttt git 5 '10 

GA rCTCATGT T AAGA< 3 AA 3 3 GAGT3TGT2C TGAAGAAGC 2' GTTGTTGTGA T3TTAAAAPG 600 

CTGACCAGAA CGCTCTTGAG CGGAGGCATC GTTGAGGATT AAGAGTGTGT GACAGA02TG 660 

GAGAGGGGTG G3TTGAGT3T GATCTCA3CA ATGCT3CGAC GGTGTTGTGT TT0AGA3TT3 720 

TT AG TIT AG T GGATTCTTTG T3A3AGGAGT CAAGTGGCTC ACAACCTC-GT GAGGG3AGGA 780 

GAG3AGTCAG TGAGTGGTTG ' 3 TGT 3 AT 3 AT ATGCAGTGT 3 G 3T 2TG3GGG GTTGGAT3 3G 8 4 '3 

2AAG'2AGATT TGAGTGTA-G3 ATTGGATGTG T3TGGTGTT3 T 3ATTTATGT TAA3GTTGAG 900 

GTATTAAAGT TGCTGCATAT GTTGAGATAT G T P 1 3 A 1 3 ATTC TGCATGTCTT GT AAA 3 AG A 3 960 

GG> 3 AT 3T 3C A TTTGTGT3TG ATGTTGGATA GTCATCCACG GTGAGTTPG3 AG 3 A FT G3A 3 102 3 

GAA3TTAGTG T2ACG-2ACAA ATGGGGCTAT TGCTACGCTT AGAATAGGGC TTGTGTGGGG 1080 

AGTTTAGAAG AGT3C 3AGGT T 3GT3AG3AT TTA3AGG3AA '3GAGG3GAGA A2TGT 3AAGG 1140 

A2AA2A3 3TC TGTGT3AG0A GAGACCG3TT T3TTGTT3TT AT3 2AGGGAT ArOGAGT?G3 12 00 

AATGA^PCTT GGGAAATATT 3GGAGAGA1T GTGTG3ATTT AA3AGAC 2'TG GATTTTTATA 1260 

TTT TAG GAGT AAATAAAAGT TTT 3ATT3AT ATGTGTGGTT GAAAAAAAAA AAAAAAAAAA 13 2 0 

AAACTCGA 132 8 
GAATTCGGCA CGACCT'TTGC AACATTA-_-
TGCT 1 3TCTTC AATTAAArCA TTTA'TG.-.C' 



TTTG-rCACT'j TTATTAGTTC 

ggtt*:k>:a.tt gatacagtaa 
tgtt^cttaat gac 



TA 



AATAGG ATT A TTTTAAGTPA 




ACAATTAAAT ATCACACTAT GACAT-. 



60 
12 0 
180 
240 
3 0 0 

3 00 
410 

4 H 0 
540 
600 
660 
72 0 
78 0 
840 
9 0 0 
960 

io;:o 

1080 
1140 
1200 
1260 
1320 
1380 
144 0 
1500 
1560 
" 1620 
1630 

GCrGGTAAATG TTT ACTTO AA AATGACTCCA TATTTCAAAT ATOTGTTTAG ACTGTGAAGG 174^ 

GCAAATAATT TTTAAGAAAA C ATTTGAAG A GTAGTGTGTT TGGATTTGTG AAT A A T CTT A 13 0" 

5 CTCACAGCAA GTAAACGTAA TAAAAGCGAA CATTTAAGCC AAAAA;G\AAA AAAAAA IBS 6 



(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1553 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:

20 tcxgottatcoa ttoctgiiaat tactttactt ac;gataatgg CCTCCAGCTC CGTCCAAGTT 6 0 

GCT'GCAAAAG GTATTATTTC G7TOCTTTTT GT O 2tC TCLVG T AGTATTCCAT (yGTGTATATA 12 0 

TACCACATTT TCTTTATCCA GTCATT r X'M T GATGGGCAGT TAGGTT03TT CGACATCTTT 13*") 

25 

GCAATTGTGA jTI'GTGCTCjC TCCAGATATG ATCTTTAACT CrTTTGr/CTT CTCCACATAC 2 40 

ATTTCCAAGT 2'CTGTTCATT CTACCTCCAA AATGTATCTT GT^TCCArTC ATCTCTCTCC 3 0 ) 

30 ATCTTCAATC IATTTCAATG CCCCAT2ATC TOTTGCAT GG AG 2> ACT G T K\ P AA TTGGC T A 360 

ACTGGCCTGT TCTTACATTT TAAAAT'OAAA AGATGTGAGA G0TGAAATO7 CTATTTCAGT 4 20 

GTCCATTGAT GGTTCT GGTT ACA.7AG2AGG T GGGTGC G TG GTOTCGCAOT GGO AGAGTT " 4 8:) 

35 

AQ2AGTGTGA AAAAGAGTGC TTGGCC ZTTT ACAGGGAAAG CAGGT2CACT JTCyOCCTGTG S40 

AO j AO' 3- AG AG OTOTGGGCAG GCTOGGACAC TGGCAGACCC TOG TO OTGGG TGGCCAAGO? 6 00 

40 AG7AGGGTAT GTGTTT2GGG T2AGTGAGAG G3CTCAGCAC 2A0TCCTCAT ■ GGC TT C CTT A 660 

CTGTTTCOGO AG AGOG TG AC 7CGCOGCTGA TTGAGTCCCT 07 OGGA GATG 0TOTCCATG3 7 2 0 

OGTTCTCTOA TGAACXVG'GOC TGOCTCAGCA GGCT GCTOCA GACCAAGAAC TATGACATOG 7R0 

45 

OCACCTCTTC TGCGTGCCCC TCTTG'T GTGT GATAGTG'GTG 1TAAGCTTGC G T A G AATT 1 2G 90 0 

50 AGGTCTCTGT AOOGGCOVGT TTCTCTGCCT TCTTCCAGGA TO AGGGGTT/ ^ GGGTGCAAGA 96 0 
TTGTCAGCAG GCAGG2TO3G G AGGC C AG TG TTGTOL-GCTT C CTG 3 T 1 CGGA CTGAGAAGGC 1320

TCACGAA3GG CATC3-SCWAT STTGSTTTCA CTGAGAG2TG CCT3CTG3TC TCTTCACCAO 13 80 

5 

TGTAGTTGTG T2ATTTCCAA ACCATCAGCT 03TTTTAAA\ TAAGATCTCT TTGTAG2CAT 144 0 

OCTGTTAAAT TTGTAAA3AA T 2T AATT AAA P3GCATCAGC ACTTTAACCA A\2G\AAVAA 1500 

10 AAAAAAAAAA AAATJAAAAAA AAAA30G302 CGOT2TAGAG GTGG.AAGTTA NGACGNOS 1 5 5 P 



(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 948 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

T AAAAATC AT OCTCTGTACC ATGGTGACCG TAGTCATCAT CATCGCOSCG CAGA0CAO3A 60 

GAAI'TACTCtG GATCCCTAAA AA3G2CC3TO GTCOSOSCCC AITST'SOSCC GGTCGATCTC 120 

CCA02CTCTT TCTCCA'JWCA TACCGC3GAC 0CAATO3GCG CCCTOCACAC OCOTTTCT3G 1 80 

GSCCGT2AGA CTT'SGATACA TCGTAA\CTC OG3 3TCCACG CAA3GTCTCG CCTKG3GAG3 24n 

AAGMTC G3 A\ TGCAGTTGCT CASGAAC SG S TCCAAAACC 2 ACACCCCGAG G3A 3GCCG2T 3 00 

TTCCGG2ATC 2CGGGCAAAC GCCG3AC3CT 2AGTCG2TCC ASGC3CCCT2 A3CCTCAAAG 3 60 

TGTAG2-G2CC 2 2AAG2-2AGC AACCTCG3TT T3GT3C 3TAA AACCC 2G2CT 3CT2TATAAG 420 

CACCG2CCCA -ICTCTGACAA AA3CCCGCCT CCAO3TC03GC AGGCTC03CT T3TTTTCTTC 480 

TCC ( 2CGGG3T GATTCAGTCC AGTGATTGG3- TTTGTG3CTC CAG3CCTCGC CCACA3AC3G 540 

ACAGACCCCT CCCTTTCTTC C 1 3G 2AAAAGG ACCGAG3CCT G3GGTAGTAA G3SCCCCACA 600 

45 GTCCTGTTTT TTGCAAGTAC A ITTTTGTCC YTCCTOSACC CAGGTATCTG CCTATTTTCT 660 

TG2TAATCC2 AGAACCTTTC CTTTT 1 3CTTT TTTTAA3GA 3 ATTT2GGAAG 'PTC 3TG3TGT 7 2 0 

A3GACCCTTC TGCCTOG3AT AAG AAAG CTG G 2TGTAAACG CTCTGTAAYT ACTCCCTTCC 780 

50 

ACCCAT2CCA GC0CCTO3GC AG3CGGG2AS AAGGGAATCC AG3CTATOGA CCTCCCAAGT 84 0 

CGCCGCTCCC GG2TCCCGTC GG2GGCCCCG CCTTGTTCTG ATCTGTGTGT GAGTGTGTGT 900 

55 GAACTTCTGA AAG AC AAT AT TAAAGAGACT TAGTTGAAAA AAAAAAAA 948 






(2) INFORMATION FOR SEQ ID NO: 55:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 990 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

10 fkGOGAACTOC AGTGACAG2A GGAGTAAGAG TG(3GAGGCAG CiACAGAGCTG GGACACAGGT 60 

A'^OGAGAC^GG GGTTCAGCGA goctagagag G07AGACTAT CAO0GTO2CG GCGGTGAGAA 12 0 

TCCAGGGAGA GGAQ7GGAAA C A J AAGAGGO GO A< iAAGACC LlOl^ArTTG r I\70STTG2AG 130 

15 

AG7CCCTCAG CCATGTTGG* „ AG ;< 72GVG2CA C2iCTOGCTAC CACGTCCl'CT ACACAGT 7CC 240 
GO'JCTGCCCT TOOTTCTGST 07TTCTGC;C : CT'3G3«3GCCG CX3TC*GGC 2CA GGAGGGGTCA 2 00 

20 G A G 7 7 0 G T C C T ■ G 7TO">A GO 0 < G j aG 7 *l a G : 7 1 j i O 2 I G '"I \_»TO AGO" 7 TOG I C G AG*. 1 1 OC Y \y_ a 

oogggocccg ogggagcag2 7Cto';gagag 02accccctg c;gcgagtgo2 atttgytgcg 420 

gtccgaagcc a2caccatga gccagcaog3 3aaaccg;gca atggcacgag tgog gg cat 2 4 so 

25 

TACTTCGACC AG3T2 2T03T GAACGAGOG2 20TG3CTTTG ACCGGGCCTC TCX>2TCCTTC 54 0 

GTAG2CCCTG T2COGG3TOT GT A 1 2AO 7TT G O GG TT G CATG TQ3TGAAGGT GTACAACCG7 600 

30 CAAACTGTCC A'OGTGAGCOT GATGCT'GAAC A2CTGGCCTG TCATCTCAGC GTTTGCCAAT 660 

GATCCTGACO TGAOC 0CO2A <;GCAG30ACC AGCTCTGTGO TACTGC 2CTT GGACCCTGGG 720 

GACCGAGTGT CTCTGCOO 2T GCG 2 70GOG3 NAATCTACTG < 3C PGGTTGGA AATA7TCAAG 730 

35 

TTTCTCTGGC TOTCd CAT IT T2X7CTCTCT3 AA'GGACCCAA GTCTTT2AAG CACAAGAATC 34 0 

CAGCCCCTGA CAACTTTCTT GT2GGGTCTG TTGCCCCANA AAGAGCANAA GCAG3ANANA 900 

40 NACTCCCTCT GGCTCCTATC CCAC2TCTTT G7ATGG3AAC 7TGTGCCAAA CA C CG AAGTT 960 

T AAG AAA AAA ATAAAACcTOT GO-C AT7TCCA 9 90 



45 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1603 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1352 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



CCGCCGTCCT AGCCGCTGCT GTCTTCGTGG GAGGCGCCGT GAGTrCCOCG CTGGTGGGTC ISO 

C<3GA<2AAT2G GAGCAGCCGC ACATTGOACT CCAGAACAGA GACGACCCCG TC\-^CA"A 240 

ACGATACTGG GAATGGACAC CCAGAATATA TTGCATACGC GCTTGTCCCT GTGTTCTTTA 3 0 L 

TCATGQ2T2T CTT'P >GCGTC CTCATTTNGO C\MC r 2NGCTT !^AAGAAGA2,A ( JGCTATCGTT 2 60 

GTACAA' 2A- "A AG 2 A oA< 2 2 AA GAT ATO 3AA< 5 AAGAAAAAG 2 TT< 3 AAAAGCT AG RATTGAAT 42:' 

GAC AGTGTGA ATGAAAV ; A 2 T( J AC A' JTGTT G G y 2AAATC 2 T< 2 :A< 22 AC AT CAT 2AAAV\T 4 r ' 

GAAGCGAATG ( 7T2ATGT"fTT AAAGGC 2ATG GT2a6;OAGATA A 2A2C' :TGTA TGATCCTt ;AA 54-1 

15 AGCCCCGTGA CCCC IA02AC ACCAGGGAGC C2>2CA2TGA GTCCT 1 ' yGGOT TTGTCACCAG 600 

GGGGGACGCC AG03AA02AC 2'2CTGCGGCC ATCATCTGCA TACG3TOGGC 2GT 3TWGTCG 66'') 

AGAGG2ATGT GTGTCATCOG TGTAG22ACA A2C2GT3GCA CTTTAT/AAAG <2CCACTA\CA 72 0 

20 

AGTCCAGAGA GAGCAGA 2CA CG2CG22AAG G2 3A2GTCAC G2TC 2TTTCT GTT2G2AGAT 7 B0 

TTAG AGTN A 1 " AAAA ( 2T : y 2 A 1 C AC AA jT< 2 AA ACC A 2AAG2A A 2' 2G A< 2 AAO 2 - 2TGATGT 2TG 84 0 

25 TTAG TO 2G G 2 TGAAAC2 2T2 AAT303GAGG T2C7G22AAC A 2 2TGTGAAG ,\GA2AACG*2A 900 

GTQG2A2AGA GTAGCA2GTG A 2CCGTGGTT TTGGTGACAT TG:;G02CAGA GTGGTGCA2G °60 

G TG AGG AG AA GGTACTTGGA G2CTCCCAGG TGCTGTGGCA G2ATA2GAAT < yGT ATTT 2 AC 1020 

30 

AGGGAA3T3G GAGAG2TTTC C FTGACCCAG GAAGACTGAG GGGGAGTGAA CATGATTA2T 1030 

TGTCTGCCTA 2AGCTTCTTG TAAAGAAGTC ACAAACTTAG TGCCTO :AC-G '2GOTTOG2TG 114 0 

35 TGTGATAATG AGG AT AG AGG ATTACTTGTG AGGCAATGTG GCATGG'IGGG GATTGTGGCA 12 00 

AACTAGAATT CACATCACCC A 2C AT AT AGG GCTTGCATTA CC ACGA 1 2GCA GAAAGCA2CT 12 60 

AGTGTTGCTG CATCTTCTTA CG2AAAAAAG ACAAAATCCA G ACTT C T AAA ATGTAAAATC 132 0 

40 

ACTG ATTTTC GATATTCK2CA G2TTACTTTT TTTTTTTAAA CAACCATGCA GGCCAAATGA 13 30 

CTTGTAATCT TGTCACCATT TTT AGGT AAA CTGTGACTTG AAAAAGTCTG GAGCAAACAA 1440 

45 ACCAATGCTT TTTCCTTTTA TTCTGTTGGR AAC ! 2AG TTTT CTTTGTGTCA CA2TTYTGAA 1500 

ACCTCAATAC GAATATTTCT CTTCCCACCA AATATTTTGA GGCAATTGAA AAGCCACAGT 1560 

GATTTATTTC TTG ATT2 "GGC AATTTTAATT TTGCAAGACA ATT 1603 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

TACAG2TCAG GATG2CTGTA A2ATTGTCAT CTCTG2GCTT CTGG2TCCTG CTTA2C 2T02 60 

TITTT- 2< 2CTG G A G 3 ACT 2 AC CAGG2 AT 2CG 2CCCAG 2AA ; AT 3TTACTAA AT C ATA 1 IT 2T 12 0 

octcc::tacc tttg:ga2A2 2T2tcactcc TG2CTG2TGT T2 2aac2ggt tctgtgg2 2a iao 

GAGTATACAT TTTG2AA2CT CTTCGAG22C ATCCTGCAGT TCCAGATGAa 2CATAGOGTG 240 

GTr 2 A 2CAGN AAGG2C C 2A 2 A- 2AT 2TA FG< 2 A 2 A' 2- 2 A 2-2' 2- 2 AAGA 2G 2 ACC A; 2CTO 2AGA' 3 30 0 

■ j 2 ACCAi 2GOT ACA 2T 2 A 2A- 2 A G 2AGCT 3CT G 2G A 2 AG 2GG 1 2TGCAA 3* 2 C A ■ 3TG2G2ACG 2 3 60 

3CAG3TCCGA AG2A:ACG2T TG2A 2 .AAA 2' I* 2TCG22CAGA CG3GAA3AGC GAGTCCAAG3 42 0 

OTTCCTG2AG GC2TTG3AA2 T2AA3CGAG2 T2ACTG22TG G2GCGTGTG2 G2ACTGCATC 430 

MYCCTTCCTT TTCTTG2T2A AAG3CAC3TC 2TTTCCT2AT AATGAAT 3GT GTTCCCTTTG 600 

CTT03CTG3G GAGC 3CC CCA G3C 2AG2TTT G2TG3CCATA GATAGCTTT2 GG2T2CCTGR 660 

GA2AGG2TGC TGAG2AG2AT T'3AG2GTGA-\ AGTCTCCCAC GAGTACACTA AACCTA3GTC 72 0 

TG3T 2AC 2AA TAG2GTTT3G AGAG2AAAG3 O2CA0AA2TC ATCAGCT<3CC TGTCTCTTAG 780 

ATCOACTTTC TTTTrCCA2 2 AG2A 2ATCCT T2AA2A2ACA '2AATTTCAG3 CAA2AGTTCT 840 

CCCCAAAA2 2 CTAG2TCTTT ACC2TTC2AT TTTAG2CTTC CA 2CCAG2TT CCACAAAA3A 900 

TTTG2-2TCTA CCTT2GAT2T G OTA 3TAAYr AACTA\ FAG j CAGG2AGTTA TTTG3GTAA 1 2 96 0 

1 2 AA AAA A G 2G GT2G2AGAGA ■2A2AAAAITT G2C2ACTG2T G2TCCT2CCC TTGOSTYT2C 1020 

A 2CT GG2AT V TG 2 ' F ATT" 2 AA TCTCTACG2T NN 1052 



(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:
ACTGCAACCT GGCAGGCAGA GGTTGCAGTG
GCAACAAGAG TGAAACTCTT GT2TCAAAAA 
5 GTCATTACTG GTGGGATCTG GTGACACAAG 
TTGGTTAAAA AATTTTGTTT TTTAATTACG 
AAGATTGGAA TGTATCTTCA AATTCAGATT 

10 

AAAGTTGTAT TTAATCCCTT GT3CCCCAAG 
TATGAAAAGA TAG2AATAG3 GAATGGTGAA 
15 CAT<3GACTTA AACCCCATGA AAACTTGGTT 
ACAAAACCAG AGT3GTTTAC ATTC C ACAAT 
TTTNGGTATT TG3 3AT-3GGA TACTATTCAT 

20 






AGTGGAGATG GTGCCATT<3C TGTCGTTTGG 300 

AAAAAAAAAA ATGA3GTTTA A3ACAGTTTT 3-50 

ATAGCATTAA ACGT3ACAT3 03ACATAAAA 4 20 

TAATGTAAAA ( 3C 3CAAGAAA 1 2 ACTTT ATGC 4 30 

TAAT AAAC AT GTAAAGATCC TGTGTATATA 540 

AATGCTATAA AAGATrCCAA GAATGTTATC 600 

CAAATAATTT AATTT'GGCAA TTCTAAAAAA 6-)0 

CCATAGTTTT AAGTGTTTTA TGGTTCCAAT 7 JO 

MACCAAATTT GCATOCAATN TTGG-GGTAAT 7 SO 

TTTT si4 
(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1215 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

AGAGGAAGTG TTTTGCCAAG CCTGTTCTCT GGAGTAACGC CATCCAGGCT GGGA3GGGAA o0 

GAGTGCTCTG CTA3ACTG3T CCCCCTCCTG CGTCATGTTG CTTCTCAGCC TTGGTTCCT3 1J0 

ATGGGAACAG AATG3AGGGC CTGAGAACAT ACTTTCTAAA TGCCTTTGAC CCAGGAACCG 1R0 

40 ATTATCTATA TTTGTTCCCA TTTTGGTTCA CCGTGACATT CCAGC:ATTGT CTGACTGTGA 24 0 

GGTGGGCCTT TGA '. IAGC CTC CAGGTTCCTC AAAACAGGCC TGAGCGATGG GGATCACAC 3 300 

CTCTGCCTAC CCACRTGCCT GCTTACCTGC C AGATAAGC A AGT'GWAGAT 1 3 T3TGCGAGT3 360 

GCTAGTTTTC ACATTCTTAC TAGT3TTTG3 YTCACCTTTG G3CAAAGGCG C 3CTCTAGGC 42 0 

CTTGCCCCAC CTCCATCAAA C3CAGACACT GTAGTCAGAC CTCAGYAATA TAGGAGGCAA 4R0 

TAATCTTTTA ACAGTGTTTT G 2 AAAC AAA 1 3 AAAAAGAGAA AAATCCCAGC CAGGGGAACT 54 0 

CGCCACCTGC CCAC3CTAGT TCCATCCACG CTCAAGACCC GCCCTTAGAC CAGGCAGGCA 600 

AAGGCCCGCA TCA3ACTCOG CCACTAGTGG G3TCCTGAGG CCAAGAAAGA AACCAGACCC 66 0 

TGTATGACAA GTT<3GGKTCT TTCCAGAACA CGACAGAAAC AGGG3GGGCC CCTTGTTAAT 720. 

GCCACTCCAT ACTCCAGAAG CATTATTCCT TATTTGGGAC AGCCAAGGGC AGATTCAC7\G 780 

GTTATTGTAG GAATAAAGAC TAGTTTACAA AGGARAAAGA GSCCCTGGAC TTCCCMAGGA 84 0 



BNSDOC1D <WO 985496; (A 2 



WO 98/54963 



PCT/US98/11422 



319 



AAGGTCAGGT TAGGGCTCCT GTACCCATT3 T3TTCCACCA CTGTTTGATC TC TCTGGC 7T 90 3 

CCCACCAGGA AT3CCGTTTC CTTTTTATGG ATCTGTTGGG AACCAGAGAG AATCAACAGA 96D 

5 

TCAATGACAT AG 3 AT C C G AA GTOCAATOAT AGTCACTTCT AGTTTGGCAT TTCACAAACT 102 ) 

CTGNACA3CA AGGTATTG3T AGGTTACTCA ATTTCAAAAG GGCCCCATGG CC AAATATGT 108 0 

10 TTAGGAACCG CTGTTTGNAT TTCTTTTTTT GGAGACGCAT TGT AT AT AAT AT ATGT C AAA 1140 

GGCTTT03GA ATTOCT 3CAG GAAAGAAATC A:3C7TTTGTTA AATCCNAAAA AAAAAAAAAA 1200 

AAAAAAATAG ACTCG l'M^ 

(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 478 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:
ATTTCTTATG ACAT3GGGGT TTGAATT3GT TGGCAAATCT TTAATTTTAA TATCCATAAP" 6o 

CAGTGAGGTC CTGCTGGCTG TA^TCATTAA TTGTGAAATC TAAG3 A G3TT AGTTCATGG7 120 

TCTAGAATTT CACAGAAAAR TGYGMTATGA TACGAGCATT AAGTTTATTT GTTGTGATGT IRi'i 

35 TTGATGCAGC TTTGTTOAGT TTATCTGTTT TTGTATTTAT TGGT 7 A TOTA CTTOCCATG7 24- t 

CAAAAGGGAC TGGTCTACAT A3CT3CG3TA AACACCTGAT CAAATCACTA AAAGAAAATG 3 0O 

TGTTACCTCT AATGAATTAT CCTGATTGTA AGTTAAAAAT CAATATTTCC CCGTAGTGAG 3 6n 

40 

GTTTG3TTTT TAAAAA3AAK KCTTAAAAAA AAAAAAAAAA AAACGAGTTN AAGAAAAGGA 4 20 

AGCAAGCTCA GGT AAC ^ ; i TGC A07A7ATTGGG CTAAGGAAGC TAGAGGOTGT G iAGAN' -7 47R 

45 



(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 618 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double



GCTACTAGGT AAGGCTTCTG GGAGTTTCAG ATATTTTGGG GAAGATTGAT TTTTOTTCTT 130

ACATGCTGTG GACCCTTGGC CATCAAATGG TAT GGG3AAG CTCAT3CGTC TGT~TGTGAT 7 4 0 

5 

G3TCATGTCA GTCAGGCGTC TTTTTAGTAT TTACTGG3T3 CTCAGTACT'G TOC3AGATGG 3 00 

TGTCGG3AGC CGT* ,-GT 1 3 3 T A TGGAGGAGGA GTGGTC2AGA G3AC7CTGCT GTGT0GCAG3 3 GO 

10 GGAGCATAAA CAAGCCAAGG GGAAAAGGCA G3CATG3AAT AAAG3GG3AG AATA 2CAGTG 42 0 

TGTGAC TTAC TG3TGACTGT GTGGATTAGC CTATCAG3AG TAATCAAGCA GGGC3GAGGG 48 0 

C ATT AT 1 3TTT GAGOCAGAAG AGTGAGCACT <3G3CCGZGO^ TGGAGCATCA AGAG3<3GGTG 34 0 

TAGGACCNCA AG3CTTCTTN CNO 3GGAG AC AACGTCAAT A AGC NG T CAG T AGTGACCGAC 60 0 
A- JTTTTi 3C-GA A G : AAGG . 



(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 751 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

TCGACCCACG CGTCCGAG3A GCTG3AGTTC TGAGACAGCC ATTCT 7CTTG CAT AG3 ACT 3 60 

TOTGOTGCTA GAG 3TCAT AG AAGTCAACAA TTTTCTTCAA CACTOGTAGG CAGCCTCTAA 120 

ATGCrCCTGA T 3ACCCTCAC CTCCTG3 3AT TCACACCNNT GTAAAATTCC ACCCCT 3GAO 180 

CTAGTGACTC A 3TTCTAACA AN 3 A< 5 AAT AC AG3AAAAGTA ACATCGCTTC TGAGGT3AG3 240 

CTACAAG3AG ACTACGATGC CTGCCTT-3GT CACCCTTCTC CTGCTCTTTC CATTGCTCCC 3 00 

TCTGATGGAA G3CAGTTGCC ATGTG AT 1 3 AG GTG2 CCTATG GAG AG 3C C C A CGTGACAAGG 3 60 

43 TATTGTAAAA A3C 3TCTGAC CAATAGCCAT CTA3AAACG3 AG3CCCAGTC CAGCAG3CTC 42 0 

TGAGATGAAT CCTG3CAA3C TGAG3TT3GA GAG A 3 ATTCT CT3CCTATCC TGC C TT' G 3GA 4 80 

TGATCACA3C CAOCACCAAC ACCTTCACTG CCTG3TGAGA GGCCAAGCCA GTGAACCCAA 54 0 

50 

GGTAAACTG3 ACA3AATCCT GACCCACAGA AACTGAGATA ATGTTTGTTA TTTTAA3CTG 600 

CTCAGTTT3T TAG A 3 AG3 AA TAGATAACTA ACT'ZAAACAC CATAAAATTC T AATATTT T A 660 

55 TTCTATCACA CAAACCAGGT AAT AC C AAG T AAATGCCATT ACTATACACA TATTTTTGTA 72 0 

ACACAATTAC ATGTGATTTT T TAAG AA-: 3GC T .75,1 
(2) INFORMATION FOR SEQ ID NO: 63:

(A) LENGTH: 782 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

C>iGLCA2VT2A TyOTOCCCGA ttcccggc-tc gacccacgcg tccc<;gttgg caactcctga 



(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 588 base pairs

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:
TTCCGAATTA AT2GACT2AC T AT AGG AAV/T GCCGTCGCCA TGACCCGCG3 TAACCACGOT 






GCTARGD22A ATT22TA3AC (TDGCTCCAAA CI' A3TG ACT A CAOATAGAATT TGATCCCCTA 180 

AC~C-277GT'~ T GO ■ AIT' A 2 TC A2TGC TGCT A AAA 1 Z G A' ITGC CTGT02TCTC CTCTCAGOSG 24 0 

20 CAGCAT'AOTA A3GAO0C3AC GTCCTAATCC AACTAGA3AGA AGCGT2AGTG GT:XAAATTCA 2.0 0 

AGGCACTTSG 2. 2TGTC AADX2 DGGCAAGGGC CA(AATTAG CGAATAGAGC TAAGSGTTA3 ."i6 0 

CTCGAAAGG23 G0CT3-AG3A GACAGGGAAT GAGA3AGGAG GATCGGAACT AGACAGTOGC 420 

TGGTATG2GT CTGAG3A2TCC CTGGGGCCTG ■3T0AAGCTCC TCCTGDTCCT TG3T3TTTTC 4 8 0 



TCCTCATCTG AGACTGAAAT GPGGGGATCC S4 0 



50 0 



30 AG-GATGGOO^ TCCTTCCTCT AACCCTT22T CC2TCAG3CT GCAACCTCTA TCCTGGAACC 

TGGCT2CAO- TOOIOOOCAA CTATGCATCT GTTGTCTGCT C C TCTGC AAA GG3CA3CCAG 6 6 0 

GCGG2CGAA-2 GCTT2.TRNCC CTTTAA'STAA Q3C-3TTAATT TTTACGTTGG GCACTNGGCC 7S0 



AGTOCTCACA GGTCCCAGCA CCGATGGCAT TCCCTTTGCC CTGAGTCTGC AGCGGGTCCC 3 60

TTTTGTGCTT CCTTCCCCTC ASGTAOCCTC TCTCCCCCTG GGCCACTCCC GGGGGTGAGG 42 0 

5 GGGTTACCCC TTCCCAGTGT TTTTTATTCC TGTG5SGCTC ACCCCAAAGT AT T AAAAGT A 4 80 

GCTTTGTAAT TCCAAAAAAA AAAAAJJ\AAA AAAA\AAAAA AAAAAAAAAA AAAAAAAAAA 54 0 

AAAAAAAAAA AAAAAAAAAA AAAANNCGO0 GSGG3GCCCC CCCCCCCC 538 



15 



(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 774 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

TTTAAACiATG AAOAAATGAC AAGGG AG< JJA GATGAGATGG AAAGGTGTTT GG AAG AG AT A 60 

25 

AGGGGTCTRA G AAAG AAAT T TAGGGCTCTG CATTCTAAC:C AT Ai K5CATTC TCGGGACCGT 12 0 

CCTTATCCCA TTTAATTAAT TTCTCTGACA ATTCAATTAT TTT<2TGTTAT TAATGTTGCC 130 

30 ACTGCTTTCT GTTTOTC IX3C ACTTTCTTGA TAAATATTTG CTATCGTTTT ACTCCAGT'CA 2 40 

TTCGATGTTG CTGAGATTTA CATATGACTC TTGTCAACAT CTCAT 2TTTT GACCCAATCT 3 00 

T ATT C ATT T A ATAAGAGGTC TCATT 2ATTT GCATGGAAAA ATG2TCATTG TATATTXTAA 360 

35 

AGTGAAAATA ACGAGTTOGA AAACAGT3TA TACATATATG TGTGTATATA TGTACACTTT 420 

ATTTGTACAT TTCTATGTGA CATAATGCAA A3GAAAGTGT CTGATTTTAT TATACACCAA 4 80 

40 AGGTTAACAG TGAATCTCTG TGTGATCTCT TTTTTTTTCT TTTTGC CT AT CTGCAT 2TTC 540 

TCACTTGCCA AAAAATGAAT ATATG TTT AT GTG TGT AT AT TACTTGTGTC ACAAAAAACC 600 

CTAAAGTAGA CAGTAAAAGA ACTTGTCAAT CGCCTTTGGA AGGCAATGAA ACACTTAATA 6 60 

45 

AACTCTCAAT AACAGAAG2G TAAAAATGAA ATGTAAACCT CCAATTACCT CTGGATCTCT 72 0 

T A' 3CCAGAGT AATAAACTGG TAATTATTAC AGATAAAAAA AAAAAAAAAA AANA 77 4 

50 



(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1866 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:

ACCCACGCGT CCG3TCCTCT TCTTCAGCAC ATGCCAAAGC TGTTCCTCAC G3CCTGTGAG 60 

ataagagcat cTTTGA'rGTA ggacaatgga agagttagat gccttattgg agoaactgga 120 

A 7G7TCC ACC CTTCAG3A2A GTGATGAATA TTCCAAC7CA G3TCCTCTTC C 2CTOGATCA 1 • ;| .' 

G3Attccaga aagtagacta aggttgatga gacttcg^ag atcctttcta ttcaggata^ 24o 

CACAAGTCCC TTG7COG7GC ANTCGTGTAT ACTACCAATA TCCAGGAGCT CAATGTCTAC 3 CD 

AGT GAAGCC 7 AAGAG7CAAA GGAATCACCA C7ACCTTCTA AAA73TCAGG A0CT7CTCAG 360 

TTCGATGAGC TCAT3G7TCA CCTGACT7AG ATGCAGG7 CA AGGTTGCAGT GAGAGCAGAT 42ii 

GCTGGCAAGA A 3C AGTT ACC A* GACAA 07 AG GATCACAA3G CCTC7.CTOGA CTCAATGCTT 4 HO 

■3GGGGTCT8G A3CAGGAATT GG7ACSGAGGTT 1 3GC ATTGGG A CAGTGCCCAA CaGGC:.:ATTGT b40 

ccatcctgo: agaaaccgat t:octog:;aag otgatccatg ctctatooca atcatggcat so-> 

7CTGACX7ATT TTGTOTGTAC TCATTGCAAA GAAGAGATTG 'TGTGGACTGC C7TCTTTCAG 6£0 

CGGAGTGGCT TCGNCTACTG CCCCAACGAC TACCACCAAC TTTTTTCTCC ACGCTGTCCT 720 

TACTGCG7T 3 CTCC 7ATCCT G GAT A AA ! 7 TG 7TGAGACX7AA TGAACCAGAC CTGCCACCO\ 730 

GAGCACTTCT TCTGCTCTCA CTGCGGAGAG GTGTTTGGTG C AG AA GC j 7 TT T3ATGAGAAG 84 0 

GACAAGAA32 CATA1TGCCG AAA3GA7TTC TTAGCCATGT TCT2ACCCAA GTGTGGTGGC 900 

TGCAATCG7C CAGT3TTO0A AAAGTACCTT TCAGCCATGG ACA3TGT 2TG G2AGGGAGAG 960 

TG7TTTGTTT GTOGG3A7TG CTTCACGAGT TTTTCTACTG GCTCCTTCTT TGAACTCGAT 102 D 

GGACGTCCAT TCTGTGA7CT CCATTACCAT CACCGCOGGTi GAAGTCTCTG C :ATGGGTGT 108 0 

G07CA3CCCA T0ACT:.X3CCG TTGTATGAGT GCCATGG7GT ACAAGTTCCA TGCTGACrCAC 114 0 

TTTGTGTGTG CTTTCTG7CT GAGACAGTTG TCG AAG GGC' A TTTTCAGGGA G2AGAATGAC 12C0 

AAGAC 7TATT GTOOXAOCTTG G TT C AAT A AG CTCTTGGCAC TGTAATGCCA ACTfiATCCAT 1200 

AGOCTCTTCA GATTCCTTAT AAAA' 7 IT A/-. A GCAAGAGAGG AGAGGAAAG7 GTAAATTTTC ^ 3 / 0 

rGTTACTCAC C 1TC30 IGTTA ATAGTGTTAT AGAAAAAGGA AA0GTG YTGA 7CA\ATA\AG 13P0 

GAA7TTCTAG A 7TTTAGATG ACTAG3CTGA TAATGTTATT TTTTAG3<3TT 3TATACAGTT 144 0 

AATTCTATAA ATTGTGTTTC TCCCTCTCTT 7TCGAATCAA GCACTTGGAG TTAGATGTAG 1500 




TTTTGTTTTT CAAGAGGAAG TAGATTTTAA CTGGACAACT TTGAGTACTG AG ATG ATT G A 1300 

TAAATAAAGT GGCTTGTGGT TTCAATAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA I86 0 

5 AAAAAA 1366 



(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1152 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

20 CTCAAGGATG TAAAGGOTCT GOAGATTTGG GGAGGCCTGT CTCCCAGCAC CTGATGGGAC 6C 

ACTTTTTGGC CC>.CTG?AA;, TTCTGGGTGT ATCCTCCACT GTATGCTGTC AC2CCAAGGG 12 0 

CAAGO\CTG<~ ATCTGCTTAG TGAAGGATTT ATTGTTCGGA AGATACATTT TCCCCTTKAG 130 

25 

CAGA3AGTOG CGTATCCTGG CAGTCTTCGG TGAGCCAGTT GTACC/iGGAT TAT3AAATGC 2 4 C= 

AGATGTTTAC TGTGTCATTG TTGOTGTCAT TGCTACTGAG GAGTACTGAC GAGAATCATC 300 

30 T'3CAACTYTT AGTTO^^AG A GAGGACCACT ATO3CGGGTA GCTCTTTTCT TTCCTGCCAT 3 60 

TGTGGGGAT3 ATTCCAGOGC AAAGAT< 1ATG GAEAAGTATG GAAATCATCT GAAAGGTT GA 42 0 

AGCTTGGOAC GTGAAGCCAT TCATGACTTT GTAAQ3CAGT TTTIXTTGAAG GCCAGTTCrTG 48 0 

35 

CCCTGGGAGG GACGGAGGTG AATCCTCCTG AGTACCTGTG GTTTTCTTAC 7TCCTGCTGA 54 0 

ATTTACCTAA GTGCCTGTTG TTT3CTTGCT GTGGAGGCTT TCTGGTATTT CATTTCAGGT 600 

40 GCAGATGCCT TCACTTTCGC ACC RAAAAAA CCCCMACCAA ACCTAAGACC TTACT3CAAC 6 60 

T AAGTYTNC C AAGTACTTTT TAACCCAATG (GGATGAACAG CCTGTGGTCT GCTCAGATCA 72 0 

CCCTGAGTGC GTGTGAGAAG OIMTNGGCTT TOO C AGG AAA TCCAGGAAGG CAGGGCCGGG 7R0 

45 

CTGTGTTGGA AG ITTGG2TTA G 2T GGTGGGG CAGCCTTATT TCAATTAAAA 'GGGCATTGAC &4 ) 

TGGGAGCAGC AGTCCT3GAG TTTGTTGCAT TTCCTATTQ? CCTCAAAATG AG AAAC < 2AGG 900 

50 AAAATAGCAG ATTGGAGGCT TCGAGAAG2G AG T AAATGG 2 TGTTTTTATT GAC AAAAGG A 960 

AAACATTTTA GT3CCATCTO A 2TGATGGC A TCTCACTGAC TTAAAATGAA GGCANGTTGT 102 0 

AGTAAAAAAA AAAGTCTACA TTTTTCCACC GCCACGTTCT TATATC 2TGT TTGTCAG2CA 1080 

55 

CTGCTCANAA GGGCATGTTG T2TTGCGGAN TANAGGCGOT CTGCTTCCCT CGTTTTCCCT 114 0 

ATAGGTTGGG TG 1152 

(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2433 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



2ctc 



TT "TGAGT'CA OCTTP 



6 0 
120 
1 K 0 
240 
3-"K) 
360 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 

agcaggosgt gcggt:}g 3gg cgggag* :agc gogkagcccg gcto3gccac accgatcgcc 

15 CG2CGCCATG G3CTCCTOSC AAAGCGTCGA GATCCCGGGC GGGGGCACCG AGGGCTACCA 
CGTTCT303G GTA2AAGAAA ATTCCOSAGG ACA2AGA3CT GGTTTGG A 2G CTTTCTTTGA 

ttttattgtt tctattaatg gtt 2aagatt aaataaagac aatgacact2 ttaaggatct 
gctgaaasca aacgtt^aaa accctgtaaa gatscttatc tatagcasca aaacatt3ga 

A7TCCGAGAG ACCTCAGTCA CACCAAGTAA C C T GTGGGG 2 GGCCA203CT TATTGGGAGT 
25 GAGCATT2GT TTCTCCAGCT TT3ATG-3GGC AAATGAAAAT G TTTOGC ATG TGCTGGAGGT 420 
GGAATCAAAT TCTCCTGCAG CACTGGCAQ3 TCTTAGACCA CACAGTGATT ATATAATTGG 4 30 

AGCAGATACA GTCATGAATG AGTCTGAAGA TCTATTCAGC CTTATCGAAA CACATGAAGC 54 0 

AAAACCATTG AAACTGTATG TGTACAACAC AGACACTGAT AACTGTCGAG AAGTGATTAT 600 
T AC AGO AAAT TCTGCATGGG G1X;<3AGAACX2 CAGCCTAGGA TGTGGCATTG GATATGCTTA 
35 TTTOCATCGA ATACCTACAC GOCCATTTGA GGAAGGAAAG AAAATTTCTC TTCCAGGACA 
AATGGCTGGT ACACCTATTA CACGTCTTAA AGATO3GTTT ACAGAGGTC 2 AGCTGTCCTC 
AGTTAATCCC CCGTC2TTGT CACCACCAGG AACTACAGGA ATTGAACAGA GTCTGACTGG 
ACTTTCTATT AG2TCAACTC CACCAGCTGT CAGTAGTGTT CTCAGTACAG GTGTAC CAAC 
AGTAOCGTTA TT2CCACCAC AA 3TAAAC 2A GTCCCTCACT TCTGTGOCAC CAAT3AATCC 
45 AGCTACTACA 'TCACTAOSTC T3ATGCCT2T ACCAGOAGGA CTGCCCAAOC TCCCCAACCT 
CAACCTCAAC 0TCCCAO2AC CACACATCAT GCCA^TOTT GGCTTACOAG AACTTGTAAA 
C7CA2GTCTG CCACCTCTTC CTTCCATGCC TCCCCGAAAC TTACCTGGGA TTGCACCTCT 
CCC7CT02CA TCC3A7ITCC TCCCGTCATT CCCCTTGGTT CC AG AG A- O 2 T CTTCT3CAGC 12 00 

AA ^T^AC^A G AOCT "V2 TOT r TTO~' : "TCCC OCCCACCAOC AACGCAC2CT CTGAC 



66 0 
720 
780 
P40 
900 
9 6 0 



114 0 



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TTGGAATTGG CGT<"rGTATAT TTAACCACGG GAGCGTGTCT OG AAACGCAA ACT ATC ATT A 150 0 

ATTTCATACT AGTTTGTACC GTATCTGTAG GCATCCTGTA A-YFAATTCCA AGO^GAAAAC lb 6 0 

5 TAAACGAGGA CGT< 5GGTTGT ATCCTGCCAG GTTGAGTO 3G GCTCACACGC TACtGGTGAGA 162 0 

TGTCAGAAAG O^CTTGTATT TTAAACAACC AAAA^GAATT GTAA303T"5G CTT' ^CTGCCA 1680 

GGCTTGCACT GCCGTTCCTG '.iGGGTGTGCA TCTTGGGGAA A3GTGGTOJC <3GOJCGTCCA 17 4 0 


CTAOjTTTCC TGTOCCCTGC TGCTCCTTCC GTAAGAAAAT GAAATATTCT ATOTCTAATA 180 0 

CTCACACGCA ACATTTCTTG T ACTTTGT AA GTC 3TTTGCG AGAYrGCVIA CCACCTCACT I8 6 0 

15 AAAC TGTAAA C' ^GTAAAGA* j ATTTTTACTT TTGGTCTC 2G TGAGTCG2AT ■ 2TCTACTAAG 192 0 

GTTT AC AC AG GAATTCCACC TGAAGACTTG TGTTAAAGTT CTA 2AGC02G -lArTGTTAAC 19 30 

TGAACGTCTT TTTCrTCAGC CTATACGCGG ATCCTTGTTT TGAGCTCTCA 2AATCACTCA 204 0 


G AC AAC ATT T T '.JTAAOTGCT >rTGTTG2TT TC TAG AT AC A CCTTATAAAG TG ACATTTCA 210 0 

AAAGAAATAA ^GTGCCACAG TTTTAAACCA GAA3GTG3CA CTCTGTG3CT CCTTGTAGTA 2160 

25 TTATAGCTAT ACT 3GGAAAG : 2 AT AG ATAC A GCAATAAAGT ACAGTAATTT T AC TTTTTTT 222 0 

CTTGTGTTAC ATCTAAATTA CAACCCTTAA TTGCC AC 3TG TCCACTTACT ACTCTCCAGT 22 8 0 

ATGTCTTATT ACTCTCCAGT ATGTCACGCA TCTTTAACTT TTCACGTCCT ATGTTTGCTT 234 0 


TCTCCCATTT TTAAGAGATG GTAAGTTAAC TGGAATTGAT TTACTGAATG AAATTAAATG 2400 

C AG AT ATC CC TGTTTTTG A A ATAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 24 60 

35 AAAAAAAAAA AAAAAAAAAA AAA 2 483 

(2) INFORMATION FOR SEQ ID NO: 69:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 536 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 



50 GAGAAATGGA OOTTTGTTAG ATAAAAATTT TTTCAACGCA AACAGTCATT TTCCAGTGAA 60 

AGG AGAGC GT ATCCGOCGTA GGATGGACTT AGATCGTGTA AAAGCTGAGG CCACCGAGGA 12 0 

TATAACCTCC G3GGTCCTTT GCCTCCTTTT CCTTAG AC T C CCTCCAAACT CGTGTATCTT 180 


TCCTTCAGCA GTACTGGGCT CCACGCGAAC CTAGTCCTTT GTCTTTACCC TATTACCTTT . 24 0 

CAT AAC ATC C TAGTTGAAAA GTARTTATTC AAC CGC GTTT GAAAATGAGA AC A 1 3GTTC AC 3 00 

60 AGARGCTAGG TT A 1 2TTGCG A AGGTCGTTCA ATTAGTAACC AGTAACGCCA G3A2TGCCAG 3 60 



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TTTCTTGCTT C C G AATT C T C ATGGTAGCTT TCACCARGCT CCCCGTCMAA TGCTAACGTC 42 0 

AACTACTGAA C TAG ATT AGO A AAAAGGTC T TTTAACAGA.A TTCCTGGTTT TCAC AGAGAG 43 0 


TTTCTTTCAT GAAGCGCCCC ATTTCTACAG AGG AAAAT AA ACTCCAAGCA G-CCAGT 53 6 




(2) INFORMATION FOR SEQ ID NO: 70:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 865 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:


CCACGCGTCC GGCCTTTCTT 3GCCAGAGGC GCCGGTTGGA CTCACGGGCG GGGOATGATG 60 

(3GTAACAGGA CC3GTGGGGT CCCCAGGAAG TCCTAGAGGG GGTCGGGGTT TOO 5TGGACA 12 0 

25 AGCTTTCCTC GTCCTCTCCC GACAGAGCTG ACGTGTCCTG GGTTCCACCG G3-AGCGGGCA 180 

TTTCCACCGG AC3GGA<3GGT TCG303TGTC CGGGGCTGG3- GAATACGTAG G3 J GTTGCCGC 240 

GCGGTGT3GG GAGTTG>3GGC G T 3TO 3CTGC AGTCCCG3GA GTTCTTGG AG O 5GGTCGGCC 3 00 


CACCGAGCTT CCGGAC 3GGC TGATCTGCCC GTA'SCTTGCC GGANGGARGG C'^GAGCTSAC 360 

TCTCCGTCCC TTCTCCCATC COZTZCAGTG GTGGGTACGG GC ACCTCGCT G3C 32TCTC : 420 

35 TCCCTCCTGT CCCTGCTGCT CTTT3CTGG-3 ATGCAGATGT ACAGCCGTCA 3CTG3CCTCC 430 

ACCGAGTG3C TCACOATCCA GGGOSGCCTG CTTCGTTCGG GTCTCTTCGT 3TTCTCGCTC 540 

ACTGCCTTCA ATAATCTG3A GAATCTTGTC TTTGGCAAAG GATTCCAAGC AAAGATCTTC 600 


CCTGAGATTC TCCTGTGCCT CCTGTTGGCT CTCTTTGCAT CTGGCCTCAT C 3A3CGAGTC 660 

T3TGTC AC 3 A CCTGCTTCAT CTT'C rCCATG GTTGGTCTGT ACTA 3 ATC AA CAAGATCTCC ^20 

45 TCCACCCTGT ACCAG3CA3C AGCTCCAGTC CTCACACCAG C3AA3GTCAC AG^AAGAGC 7 30 

AAGAAGAGAA ACTGACCCTG AATG TT 2AAT AAAGTT3A r IT' CTT'?3TAAAA AAAAAAAAAA 840 

AAAAAAAAAA AAAAAAAAAA AAAAA 865 




(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

TCATCATATA CAAAGTTTTT CGTCACACTG CAGGGTTGAA ACCAGAAGTT AGTTGCTTTG 60 


A'^AA'CATAAG GTCTTGT3CA AGA3GAGCCC TCGCTCTTCT GTTCCTTCTC OGCACCACCT 120 

GGATCTTTQ3 GGTTCTCCAT GTTGTGCACG CAT'ZAGTGGT TACAGCTTAC CTCTTCACAG 1R0 

10 TCAGCAATGC TTTCCAG 3GG ATGTTCATTT TTTTATTCCT GTGTGTTTTA TCT A< 3AAAGA 240 

TTCAAGAAGA ATATTA2AGA TTGTTCAAAA ATGTCCCCTG TTGTTTTGGA T3TTTAAGGT 3 DO 

AAAGATAGAG AAT3GTG3AT AATTACAACT G3ACAAAAAT AAAAATT02A A 3CTGTGG AT 3 6 0 


G ACC AATG T A TAAAAATGAC TCATCAAATT ATC'CAATTAT TAACTACTAG A 2 AAAAAGT A 42 0 

TTTTAAATCA GTTTTTCTGT TTATGCTATA GGAACTGTAG AT AATAAG 3T AAAATTATGT 430 

20 ATC AT ATAG A TATACTATGT TTTTCTATGT GAAATAGTTC TGTCAAAAAT AGTATTGCAG '54 0 

ATATTTGGAA AGTAATT3GT TTCTC AG 3AG TGATATCACT GCACCCAA3G AAAGATTTTC 60 0 

TTTCTAACAC GAGAAGTATA TGAATGTCCT GAA3GAAACC ACTGGCTTGA TATTTCTGTG 66 0 


AC T CGTGTTG CCTTTGAAAC TAGTCCCCTA CCACCTCGGT AATGAG 3TC C ATTACAGAAA 72 0 

GTGGAACATA AGAGAATGAA GGGGCAGAAT ATCAAACAGT GAAAAG3GAA TGATAAGATG 780 

30 TATTTTGAAT GAACTGTTTT TTCTGTAGAC TAGCTGAGAA ATTGTTGACA TAAAATAAAG 840 

AATTGAAGAA ACACATTTTA CCATTTAAAA AAAAAAAAAA ACTNGA3GGG GGC CCGGT AC 900 

CCAAATCGCC GCATAGTGAT CGTAAACAAT CT 93 2 




(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 996 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 

CGCCTGGCAC CATGAGGACG CCTGGGCCTC TGCCTGTGCT GCTGCTGCTC CTGGCGGGAG 60 


CCCCCGCCGC GCGGCCCACT CCCCCGACCT GCTACTCCCG CATGCGGGCC CTGAGCCAGG 12 0 

AGATCACCCG CGACTTCAAC CTCCTGCAGG TCTCGGAGCC CTCGGAGCCA TGTGTGAGAT 130 

55 ACCTGCCCAG GCTGTACCTG GACATACACA ATT ACTGTG T GCTGGACAAG 'STGCGGGACT 240 

TTGTGGCCTC GCCCCCGTGT TGGAAAGTGG CCCAGGTAGA TTCCTTGAAG '3ACAAAGCAC 300 

GGAAGCTGTA CACCATCATG AACTCGTTCT GCAGGAGAGA TTTGGTATTC CTGTT3GATG 360 




ACTGCAATGC CTTQ2AATAC CCAATCCCAG TGACTACGGT CCTGCCAGAT ^TCAGCGCT 420 



AAGGGAACTG AGACCAGAGA AAGAACCCAA GAGAACTAAA GTTATGT2AG CTACCCAGAC 



GTATCTCTCT ACCTTCT 3G A AAACAGGGCT G3TATTCCTA C2CNGGAACC TCCTTTGAGC 

ATAGAGTTAG CAACCATGCT TCTOATTCGC TTGACTCATG TGTTGCCAG3 ATGGTTAGAT 

ACACAGCATG TTGATTT3GT CACCTAAAAA G AA< GAAAAGG ACTAACAA3C TTCACTTTTA 

T^AACAACTA TTTrGAaAAC ATGCACAATA GTAT3TTTTT ATTACTGGTT TAATGGAGTA 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 785 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:
GGCACGAGGG GCrTTGCGTA CACAATAGCT GCTAGGAGTA CCCAAA3CCT GARTACARCC 
TGCTGGTGTC ATGGCCACGT GTGAGCAGGC CAGCGTCAMA CGGCTC3CTG TGACCCGTCC 



480 



TTAATGGGCC AGAG2CATGA CCCTCACAGG T2TTGTGTTA GITGTATCTG AAAC TGTT AT 54 C 



600 
660 
72 U 
78m 



15 ATGGT ACT TT TATTCTTTCT TGATAGAAAC CTGCTTACAT TTAACCAA 3-3 TTCTATTATG 84 0 

c , rTTTTTCTA ACACAG A 2TT TCTTCACTGT CTTTCATTTA AAAAGAAATT AATGCTCTTA 900 
AGATATATAT TrTAYGTAGT GCTGACAGGA CCCACTCTTT CATT3AAA jj T3ATGAAAAT 960 


CAAATAAAGA AT 3 TCTTC AC ATGARAAAAA AAA-AAA 
(2) INFORMATION FOR SEQ ID NO: 73: 



6 0 
120 



40 CGRAGACTGA AATGGGCCTG GGTCTTCTC C TKGTCCTGTG ATWAAAGTCC TCTCTTGAAA 1P0 
GTGGAGAGCA AA3GCACACA GAGGTGCGCG CTCACAAGAA TTCCTCCCGG T3ACTG3GTA 
ATCAATGTTA CTG-3TOTTTC CTTTGCAGGA AA : 3 AC C AC AG CAA'^ATTTTT T3ATTCGTCT 
CCTCCTA3CC TG3G3GACCA G3CT2GAACT GACCCTG3AC AT 3AAAG3AG G3ATTATGTG 

GCTGCTAAAG CCAT03G3CC ACAGCCCTGT TC ACRTCTTG GT3CTTCTCT TTCCCAGAGG 42 0 

50 CTGGTCCCAG CCAGGCACAC ACAAAAiGGCA GATTCTCGTA AAC SC AG2 : 3T CCCTCCCTGG 480 

AGGCTGCCTC CTO2CCT03A TCTGGAGTGG A'GC TGC T CTG AGATTTICAG TTCTTCTGCA S4 0 



240 

3 00 
>»0 



(>() TAAATGTAAT - "a : ATTT TTTT TTTGTAAAAA A3A^AAAAAAA AAA AAA AA A A AA AAA AAA A A 
AAAAA 785






(2) INFORMATION FOR SEQ ID NO: 74:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1069 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:

TCCTCACCAT TCCCCTAGGN ( ~S AGGTC CCTG CAGGTCCCAC ACTTCTCCCA GGTCCCTAAA 60 



CTTGGGTCGG TCCTTTCCCT GGAGTAGCTG GNTCCTCCAG TCGAGGTCCC TGTTCAGTCG 12 0 


GTTCTTAGGC TCCTGCACAT GAAGGTGTGT GC CTGTGGTG TGTGGGCTGC TCTAGGAGCA 180 



GATACA03CT GGTATAGAGG ATGCAGAAAG GTAGGGCAGT ATGTTTAAGT CCAGACTTGG 240 
25 CACATGGCTA G3GATACTGO TCACTAGCTG TGG AGGTC C T CAGGAGTGGA GAGAATGAGT 300 



AGGAG3GOAG AAGCTTCCAT TTTTGTCCTT CCTAAGACCC TGTTATTT3T GTTATTTCCT 360 



GCCTTTCCGA GTCCTGCAGT GGGCTGCCCT GTACCCTGAA CCTCATGAGC CTCTAAGGGA 420 


AAGGAGGAAC AATTAGGACG TGGCAATGAG ACCTGGCAGG GCAGARTACA AGCCCAGCAC 480 



CAGTGTSCCA GCCTTACTGG GTCCTTACCC TGGGCCAAAC AGG3AGGGCT GATACCTCCT 54 0 

35 TGCTCTTCCT AGATGOCCAC CTCCTACAAT CTCAGCCCAC AAGTCCTCTC CACCCTAGGG 600 



GGCTTGCT 3C ATGGCAATAA CTCATAATCT GATTTGGAGG TTTGCCCTTT ACAGGGGCAG 660 



ATTTTCTGCT CAGTTCAACA ATGAAATGAA GAGGAACTCC CTCTTTCTAC AGCTCACTTC 720 


TATCAGAG3C CCAGGTGCCT C AG AGCCAC A TTGAGTTGCT TTTTCTGGGA TGAGGAAGTA 7 80 



GGGTTAAACT CCCCAGTTTC CTGAGGGAGG CTCCTGACAG GTGCCCTTTG TCAGACCCTA 840 
45 C CAC AGS C TG GATAGGCAGC CACATTGGTC CTCGCCCTTG CTCGGNACTC CGTGGTGGTC 900 



CTGCCCTTCT CCCTGCATGC CTGTGGGTCT GCTCTGGTGT GTGAAGGTCG GTGGGTTAAC 960 



TGTGTGCCTA CTGAACCTGG CAAATAAACA TCACCCTGCA AAGCCAAAAA AAAAAAAAAA 102 0 


AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAA 1069 






(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 831 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 


GG AC ATT AG A TCACTGT3GA CCTAAAACAA AC AAA C AACT ATAAG3AAAA TOGO ATT AC A 60 

AAT03TCTGG 03ATCAGTTT ATCACTGCAG TTGTTACATC ACCCCATGGT CTAAAATA 1 2A 12 0 

10 GA03TTTAGT CTGTCTCTGT TTCAGTTCAT TTTACAGGAG GTGAA2ATCA CACTTCCA3A 180 

AAACTCTGTC TG2TATGAAA ' 3GT AT AAATT TGATATTCCT GTGTTTCACT TGAATGGCCA 24 0 

GTTTCTGATG ATGCATCGAG TAAACACCTC AAAACTTGAA AAACAGCTCC TGAAACTTGA 300 


G2AGCAAAGT ACTG3ARGCT GACTGATGCC CTCATGATTT TCCACCCTCT CTTCCCATAA 3 60 

AGOATCTTC3 TAAG3AAATG AMOATGOCCT <'IATACTCATT TTGTCACTTG TACAGAGCCC 42 0 

20 TA-Y3GATGTT <"Ti-;AATT^AG TGGT3CCAAA TAAATGTTGA CATTCCCCTT 'IT 3GTTGATG 48 0 

GAAGTATCAO TGT-3GGAACT 3TTT>3GTTAA T030^TTTTA TAAAATAAKA AKAK C AT A' IT 54 0 

A3CA3GGAG3 GAGATGATGG AG3GAG3GAG AAGTCCATTT GTCTTATTTA TCCTTTTT jT 6 0 Ci 


ATTAATAGAG AAGCACTTCA CAGTCACTGG CAATGCCATT TATAG3AAGA AGGTTCTGCA 660 

TTCCTCtCTGC TCCCGGAGGG CTTAACTTTT TAAT3AAAGA ATAAATGCTC TTCCACTCAG 72 0 

30 TAGATAAAGT GAAATGTGAA TTGTTAATAA CTGT3CACGG TCAATAAA3C GATOTTTTAA 780 

GGAATACAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAACTCG A 831 


(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 590 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:

T A ' 2 A T AT AG A CNCTTAATAO TCOTGANTGN TGTGNACGAA C ATT AA CGGA A.G T AGO AT'" \ T 6 0 

AGCCAGTC3A AT AAG NT' AT A AG3ACAAAGT 3GAGTCCACG CGTGCGGCCG TO TAG ACT AG 12 0 


TGGATCCC2C O 3CTGC AGGA TTCGGC AC 1 3 A GCTGCCAGGT (3AGGAGCAGA GAGACTGTTC ISO 

CCTTGGGTGG A3AG3TGTOG GCATO^AGAGC CAC CCATTGO CAAGCAGCAA GAATG TTC GT 24 0 



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ATGOA ;CCTC ACAGAAGCAG 
AOATOGSGCT CCCTTAGAAC 
GGGGGCCCGG TACCCATTCG 



r;« JTCTO20TC CACTTTACCA 
TT ACTC ' 2AC T ( 2 ATTTAAAAA 
CCOTAAAAGT 



(2) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1274 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:
GAGCCAXAC ACCTO3CCTG 1 2 AAG2 AACCT CTTAAAATCA GTTTACGTCT TGTATTTTGT 
T< 7TGTG AT'- > 2 AOS AC ACT: ^3 AGA 1 IASTT 3C TA' IT 2CAG' T 2 .-\A' D 2A' P 3TO 3 AGT' 2AC TO 2A 
CTCTGAAAAT C 2TATTG3TT 2CTTTATTTT ATTT3AGTTT AGAGTTCCCT TCT03GTTTG 
TATTATGTCT O2a2AAATGA0 CTOGGTTATC ACTTTTCCTO 0AO3GTTAGA TCATAGATCT 
T03AAACTCC TTA'SAGASCA TPPPSCTCCT ACCA^GGATC AG ATACTICS A GCCCCACATA 
ATAGATTTCA TTT ! 2ACTCTA 02CTACATAG AGCTTTCTGT T02TGTCTCT T02CATGCAC 
TTGTGC2GTG A'P r AC AC ACT TGA'TAGTACC AGGAGACAAA rGACTTACAG ATCCCCCGAC 
ATO2CTCTTC CCCTTOGCAA 02TCA2TT02 CCTGATAGTA 02ATGTTTCT GTTTCTGATG 
TACCTTTTTT CTCTTCTTCT TTOCATCA 2*2 CAATTCCCAG A^TTTCCCCA OG2AATTTGT 
AGA03ACCTT TTTG03GTCC TATATGAGCC ATGTCCTCAA AGCTTTTAAA CCTCCTT02T 
CTCCTACAAT ATT 1 2AGTA I! A TGACCACTGT CATCCTAGAA GSCTTCTGAA AAG AGGO 2^ 2 A 
AGA02CACTC TGCG2CACAA A03TTGGGGT CCATCTTCTC TCC3AGGTTG T2AAAGTTTT 
CAAATTGTAC TAATAGGSTG 03GC2CTGAC TP202T 3TOS G2TTT03GAG G3GTAAGCT3 
■ 2 TTTCT AG AT CTCTCC 2AGT GAOSCATOSA OSTGTTTCTG A^TTTTGTCT ACCTCACAOS 
G ATGTTGTG A 03CTTGAAAA OSTCAAAAAA TGATGGCCCC TTGAGCTCTT TGTAAGAAAG 
GTAGATGAAA TATCGGATGT AATCTGAAAA AAAGATAAAA T2TGACTTCC CCT02TCTGT 
GCA'SCAGTCG GOCTG3AT02 TCTGTGOCCT TTCTTOG2TC CTCATG02AC CCCACAGCTC 
CCAGGAACCT TGAA02 '2AAT CTGGG3GACT TTCAGATGTT T" 2 ACAAAG AG GTACCAGGCA 
AACTTCCTGC TACACATGCC CTGAATGAAT TSCTAAATTT '2AAAGGAAAT OSACCCTGCT 
TTTAAGGATG TACAAAAGTA TGTCTGCATC GATGTCTGTA CTSTAAATTT CTAATTTATC 
ACTGTA2AAA GAAAACCCCT TGCTATTTAA TTTTGTATTA AA'GGAAAATA AAGTTTTGTT 
(2) INFORMATION FOR SEQ ID NO: 78:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1133 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

AGO ATT TTT C CTTGTTCAAC CAAAATCTGA GCATTCTTTC TATGTTGAAA AC AC TG AAAA . 6 0 

ACTAATTTWA GTTAATGAAC TAGAAAGAAT ATTGATTTTW AAGAAACAGA AAAATACTAC 110 


TTATTTTCCT TCT CAAATAA CGTTTCTTTC AAAAAC TTCT GGCTGAAGTA TAACATGCTG 18 0 

GTAGTTAACA TAAATCTTGT CTTTCTCTTG TTCTTTATCT TTCTTTGTTA TTTAGATGCT 2 4 0 

25 TGTATAAATG TCTTTTCTTT TTATTAAGTG CCTAATTGAC AGAGCTTAAT IT 3AAG AAGT 3 00 

GCCCTAATTT ATT3ACSACT TAA3AATTGC CTTTATTGO 3 GTATTTTATT TGTTCCTGCG 360 

TCTTTTTG AT GTT 3TTCAGT CTACTCATCC CTGTGAGTAT GTGTGOSGGA CA3CTGATAG 4710 


A A SGOAOjG AG AGT3TGTCTA TGC TO AGO AT T30CCTTTAG CCACTCAGCC AGAGATCCAC 480 

A-SGGAGCAAC AAG3ACAGTC TCACATGCTT AGACTTTCTT OSAAGAAACA GT3A3GAGGA 54 0 

35 GTAAGTCGTG AGTAGTGTCA AG- 3 TOG ATGT AGAATTGTCC T AAOGC AGTT GACCSCACCT 600 

TC 2AACATGT TTT OA C TTT A TTT3CCCCTC CrTACATTTG OSTTAGGTTC CATTTGGATT 660 

TGCAGCAATA ATT-ACTTTAT TTCTCTCTT3 GT2A3GATTT ' 3GC AC AT AAA ATC CTTTT AT 720 


TATAGAACTA GOT ATTTT AG TTACAT AGT A ATGTAACTAA TGGAGAGATT TATAGAGAAT 730 

TTTGKTTTTG CTOIOATATA TGTCC ATTTT G3AGACAGAT AT' 3 AT AG AAC TAGAAATTAA 340 

45 GTTGCATTT S CO AA -T ATTTGAATGA ASTTCAAGTA TCTTOTTAAT TATTAAATTT 3)0 

TOTOATGAAG :• OOO v .TAA- ' AAATATATAG TATT ATT AAA TCTAATTAAT A" T TOG AAAT : )6 0 

ATT AA I AAAT AGGTATTTTA TTT ACT GTAA AAAG TO AAA. C TTOATTATGT AG AT AAAT CT 102 0 


T AIT CTTTT 0 ATTCTTTOCC CTGTTT AO AT CCTTTTTACA AAOOTTAGTC AO C AATT AAA 1030 

GCTTTOCTAT CAAAOOVAAAA AAAAAAAAAA ACTCGAGACT AG TT C T CT C T OCT 113 3 
(A) LENGTH: 661 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:
GAATT2GGCA CGAGG3GAAA AGZATGCTGA ACGAGAG3AG AAAGCCTCTT TCCTTTG2TT 60 
10 CACGCCTTTC CAGTCTTTAT T* IT AAA ZTCG GGTTCCCTTT CTGTGGTCC-G AGCAACCTTT 120 
ACTCCACCTG CACTGZTGCT C 2TGGG3GCT COOZAGOrCT CCCT0TGO2T TTCTACCCAG 130 
TGGOTGACG 3 GATGCCTGTC TTOCCTG3AC GOACCACTGO TCTCCTGTCC CTCACCTTGG 2 4 ; "i 


CTTTT3CTGT G2CCTG2TCT GG3GTTGAAG CT3G:CCATG TGTCCCCC'SG AGTCA'P >3CT 30 J 

GCTCOTCCTG C^3AG>:CTCT GT3T32GTCA ZGTCTTCCAO ACCT3GOG3G AGCT03C3AG 3 60 

20 CCCGT3CTCT GTTrCZCTCG GCTG3TTOG7 ACAGAGYTGC A3CCTGG3AY TCTCCGT3GA 42 0 

CCCAGACTG0 G3ATTTTGCC AG333003GA ■PG3GAGGAGC A3GT3CTTTG 0CTO3CG3CT 4 80 

GT3T7TGCAT TT0TO3ACGC GCCAGAGSAC AGAA3TTGCC G 3C ACTT P. 3A GGTCTTCCTC 54 3 


GGCATGT3CC AGATTA-ZATG AGT3ACG3CT < 3G AAT ATGT TTTCTTTTP V GTAAT33AG3 6 00 

CGTGTTTCA2 ATATAGTAAA GZTCAOZAAA AAGTAAAAAA AAAAAAAAAA AAAAAACTCG 6 60 

30 A 661 



(2) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1378 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:

45 ATTGGGTACC GGGCCCCCCC TCGAAGTTTT TTTTTTTTTT TTTT AAT 3AA AOOT2T : 2AAA 6 0 

TAAGCGATTT TATTCCTATC CATGATTGCA GACATTTACA AAACCATAAC ATCTGAGTTC 12 0 

ACCTTAAAAA ATAA2TTATA TAAAGCAGTG ATATACACAG '2ACAAAATAG TTC A GG 3AGG 130 


GGGCAGGAGC AAC TTGTAAT AATTAAAATG TAAACGTGAA AAAAA3GATG GAATAAAAGT 240 

C CCTA 2TTAT TTCTACTTAA GATGTCATGT GAT AAT ATT T PACAATGTC2 TGT3G3TCAA 3 00 

55 TGTATGTATG TGTATATGTC T3TATAACAT ACA2ATATAO AGTACATTCT CTTTSCCACA 360 

CATATACATA CACACATAAT TATTTGCA3T TCAGTTTAGG ■3CAATTCTAA TATGCCACTC 42 0- 

CGTACAGTTG TTTGAATCAC ATTTGGACCC GCTTTCTTCA CAAAAGAG3G GAGAGAGCAG 480 

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GAAATAAAAA GGTTG3TTTG GTGTGACTGA GATTCCTTTG TTT AA C TG T A CACTGTGATG 54 0 

AAT AATTTTC TTCCG PACTA GTTCTGTGAA OGGCTGACTC ACTGTGCTCT TCATGAGGAG 600 

ACTTGGTAAT GGATCACACG CTCATTGTCA TOCTAGQ2GA GTAACTCTCA CTCTGAAAAC 6 60 

( ! 3 ATTT .AAG AA ATTTCC7CCC ATT TCGCCAT CArC2CTTG3 AGTGCCCGCT TGATTACTCA 720 
GXTCATATT ATTGGGAGAA TT 7 TP j* GAAA TACTGTCCAT ATCTCCTGAG CCTAAAGAGJ 
CATTCATGTG ATGTGACCCC ATT2CTCCTA ATCCACCCAT G.^GACCATCT GACCCAGGRC 
7 2ATTG7iAAA ATTAG2TCTG TTAG3TCCAG GAG2TACTG7 ATTCATTAAA GTATACATGT 
15 TATCACCAGA GTTGjTTGAA TCTG2TGOAC TAG2CATGAT GGGTGTTCCT 7GTGC7CCTC 

oacctcctg:; aggacctaca taattcccag gagatgctga G2agtatg3T a'I^gaat-pcg 1020 

CATTT21TGG CCTTTG2CCAA G3TCTACCAC CACCTGGACC CATGTTCATT C 2AGG2ATTC 1080 

^ r\ 

CAGGGCCAOC TM/vXArTC AC/IGGG2GT2 TCATTGCACC TCCATAGTTC TCTG0TC2TA 1140 

AGGG2ACCAT TCCTCTTG3A GGAGTCATTC TCTGCATTG3 CCCACCCATA TTTGCATGTC 1200 

25 CTT3TTGT3G AGTTGGAT2C ATTCCACTGG G3AGTAATGG CTGACTTCCT G3GACACGTC 1260 

CAAGTGCCTG ATTAG3TATC CTCAATGGGG G7CTTGGAC3 TCCAGGGTAC CGA3GTGACA 132 0 

TAAAAGGGTA ATCATOGAAG GCTTTTCCTT CACTTGAGTG TTCA2ATGTT TCACGTCT 1373 




(2) INFORMATION FOR SEQ ID NO: 81:



730 
340 
9 00 
960 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1440 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:

ac ttt; crcc a aat : rccx :tc tc A' : at ag tc a< 3CT< :;n ag :iaattt aaaa tg aattg 2- ": a 

AGTGAAGAGT CTG 2 > }ATTA ATTG3C73 r I l T AATTAACAG3 —TTATC2AAT GTGTOCT7AA 120 
GGGAGAGGCC CAACCZTAAT TAAGGAGCTA AACTTCCTGA GTGAGGGGCT < jTOAGGATGT" 
50 AG3TG3AG3A TGCATCTTGG G7GG3TCGTG OCCG3GCCAG CAGATGGCGC CTCCCTG3CT 

GAG" TGCC ~G C ACCGCCAGT TC C CT '2 ATTT CCACT 7 AG 1A A^ >:XT.AGAGAA 7G Z AG AGTGA 30 0 



6 



180 
240 
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TTTTTTTTTA GTTTTTACCT TTT C TT AATT ACCCTTATTC CGAATGGACG AACACTTTCT 600 

ACCACTGCTG ACCATTGTAA AAT AC C GTGT ATATAAATCC '^T'TCAaATA ATGOGCTG-GA b f (' 


AT AG AACAT G TCAAATG2TG GTT AA' !TACA GACT : AGGTC GATTACTTGT ATTTOATGTA 72r 

ATGTTCCTCC AAGTTAGACA TCTGGT<3CAA GACCAACCGG '3AGACCATGG AATT3TCAAA 7o0 

10 AGTACAAACT GACAGTGTGT ATATTTAATT TAAAGACTTA TTTAAAAACT C A G AAGCTCT P40 

CACCTA.GACT T I s 3<3AG AGC A GTCTGTTTTC TGTAATGTCT ; "IATACT AG AA ACTAA.TTTGC !>0ii 

TTATTTTAGT TGTATT'CAAG ATTTGAAGAT GTATTTTATA i^ACAAGTTCT GTTTTTGAAC '^ni: 


TTTGTO^AAC T^^.-r^ATC Aj ^ tc . VvTTT( - ; oi'AGTTAT jA TG ACTA' ETTA CATT ATG AAT 102'.' 

GT A' rAAC CC A C jA 3A TGATTT GTAAA' JCCGA ' GA 3TATGTTT CTATT.V 2A; "A AG A( TTTT'IG 1 0 R 0 

20 ATACAGCGTC TCTTGTCTTC ACTGATACTG GAGTGTGGGT TGTCTO^X', GTCGGTTCGA 1 L4" 

GTT : GT A3 TT AG AG AG AC AA TGAT A 3TGT 3 ATTT r ATTTT T AAT AT 53 AT ATG- TAT 3 AA 1 2 0 0 

ACTGTGATAC A GTT AT AA TT CACTGGTCCT 3CATCAGGAG ATGGAGTG3G GAAAACTGTA U:{> ; > 


TTT AAT AC A G TTTGTATCT3 AAT AAT 3TGT ATG3TTTATA 3AGTTT 3TGT TGTTCAGAGA 1320 

TGTTTAAAGT TTGATCTTTG TTTTTCTAAA GATTAAAAAA GCACTTGI'CC CACTGTAAAT 1380 

30 ATACAGCATG TAAAATTTCT RTAGTATATA AATO3CAG3A AATCACAAAA AAAAAAAAAIJ 144 0 



(2) INFORMATION FOR SEQ ID NO: 82:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1381 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:

45 CCCGGGCTGC AGGAATTCGK YACGAGGCCA GCAGTTGCTC CCAGTTGAGj AG3TGCTCCT 6 0 

GTACCCTGGC CACAGCC3AA TCCTGCCA3T -GCTGACATCT GSGGAGACTT TACCAAATCT 12 0 

ACAG3ATCAA CTTCCAG2CA GACCCAGCCA GGCACAGGCT GGGTCCAGTT CTG AC CTG AG 130 

CACG3TTTTT CCTCATGTGA CTTCTGGGAA GGCGCTCC C T CATCTGGGCC AAAGGAAG3A 24 0 

GGAOGAAGCC CTCCTCAGCT GGCCTGTGTT TGGGGCATGA ATCTCTCCTC TCrTCCTTGT 3 00 

CTGGCTCTGT TGA'CAAA 3 GG GGCATGTTTG GC AGTAAATT GGCACCGTGT CAC AC TG TTT 360 

CCTGGGATTC AAGTAT-GCAA CCAGAACACA GG A3AAGAAA AGCTCCAGGA TCC3TGTC3C 420 

CAT CT3TGC T GTTGATGTGA GAGAGAGTCT GAGACTTCTT CCATCGCAAT GACCTGTATT 480 

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AAACACAAOC CCCCGAAGCA AAAGAA : 0A< >0 TTGAGTTTGG TGCCAGGATT C AG AT C AG C C 540 

CTTCCCAGG3 TCT 1 0C AGGTG TCAGATGATC AGAGTTGAGC GGGAGG7TTT OCGTA2CCAC 600 

AGTGGGTGTA GOA 2TTCAGT CCATCTOOGC TG G AG A< 3G A G GGTTTCTTCG TG ATTTTT AG 660 

CAG jTTTAG A GGCTGCAGCT TGAOGOAGAA T2AQJAGGGA A A 1 TOG AAG0 ATTAGGAGCT 72 0 

TTT AAAAAT G TTTAAATATT TTGCTTTGOT AATGTG2TGA TGCGCAGTAA CT 2ATGTTTG 7 8 0 

CAAAAGGA^C TGOTGCCTCG G0CTG:OCCA GOT GGG3CCT CTGAAGGGAT TCGTCACTGT 34 0 

OGGOAGGTGO COTGAGGITG AGGCAGCA0.T GTTOATGTGT GGICAGTTGT CTGGTTTCCA 000 

1 5 TGTATTGTAG G0CAGGTAGG GAAGACAGAG 1 '2 2 AAG G Z O. 2*G TG0TGGAAGC CAGACGGAAC 960 

AGTGTTGGGG CAGGAAG3TG GATGCTGTTG TCATGGAGCT GTOGOAGTTG GG ACTCTGTC 1020 

TGCTGGTGCG CCTCTCGGCT CACATGTTCA GAGTGCAGOT CCTGGCAGAC TT:k:GTTTTC 1080 

TCTTTGGTGG TTT:TAA.^GT GG3TTATGTG CAAACAACTT GTTTTCTCCT TCAGOAACTG 114 0 

TGAATGGCTA GAAGAAG3AG CTOAGTAAAC TAG AAG TCCA G3GTTGCTTG GTTTACTGGT 1200 

25 TTAT AA GAAA TCTGAAAGCA OCTCTGAGAT TOCTTTTATT AA OTCAGGTG T C AGTTG AAA 1260 

GATTTCTTCT TTGAAAGGTO AAGAGCGTGA ACTC-AAAAAA GTG TTGGC C T TTTTGCOOGA 13 2 0 

CCAGATTTTT AAG AT AA. AA T AAATATTTTT ACTTCTGTCA AAAAAAAAAA AAAAAAATNT 1330 

C 1381 






(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1706 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:

actgcaccac tgcccaggtc tccccax:tgg atgaagacgt gotccatgag ciaaootc-gct CO 

AGCTCAGACT GGAGAGTAGC TTCAGGAAAA AAGACAAGTG CGGTAAQGAA AT C AG GOG C G 12 0 

50 OIAACTATCA TCTGAGGGCT AAAGATGAGA AGTAGATCAC TTAATAAGAC AAAAGCCTGT ISO 

AGOGG^AAAA GAAAGGA0GT TTAAAAGGAC AGAATGTTTG CCAAOGTAGA A AT GA G AC TG 24 0 
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CCTGGGGAAG CATCTGATTT AG AAATGTGG GTTAGTGTCC AGAGAATGGA AAAATAGACA 54 0 

AGAGTCAAGG CTGGCAGGAT AACCTGTAAO AACAAAGGGT T FG AAAAATG AC/ 'jTTT'OGGT 600 


TAGGAGAGGG AGAGACAGAT A<G:0AGAAA0 AGACCAGTGA AGA<3GAGAGA AAATGAGTAA 660 

agggagagot aattcctttt cgagtg/oaaa atgagtgata ttctggacat tcttc a gagg 7::o 

10 CATCTACACG AAGTAGAAAT GTCACCGCTC CCTAATTTAC TCTACGTCTT CTAGAATCCC 780 

TCAATATTAT CCTTGGCTT 1 : CAG3AAATC.: AAGAAGACCC TGGAAGTAGA GTCCACCTTC 840 

taagagaoga atgtaagaog tgacccccao cgac :tgatc ttcctcggtt tgtccactco c <oo 


AGGCA:TGAG ACTT3AOACA CCTAGTGGCG ACCTAGAACG TAGGTCCTTA AAATiTAOGO 'J GO 

CCCCAXTCCG CAACCCATCT GTA3C2T3TO CACTCACCTG GTGAGGAACY TYTC 3TGTGT 102 0 

20 CCACA3CYTT CTGCAG3AGP ?»:MCMJ GCTCATAGA3 CTCCCAOCGA GTCA3GTCAT 1080 

GAGTGCTTT3 GGGGAGAAA J G>3AATGTTA TAGTG 3AAAA GAACAGAGC/ j AA rCAACTCG 114 0 

ACAGAG AC-OA GTAAAAACG 3 OAT3Q3GAA3 AGGA03AAAG GCACTGACTT GTAGAA3GC7V 1200 


GAGAG3CGTT TCAGAGTG3C r 33 GAG ATT A TATA 70TCAT :CT-3ATCTAG GAAGGACGAC 126 0 

TGAGAA3GAA AGAAGATGGA 2 AAT A SCAT* V TCCCCCAGAA GTCAT7AGTG GACATCCCGG 1320 

30 GTCTT OCA'GC CCCTCC2AC2 CTTGTTTGO j GTGTCCCATT 3TCCAGCCC2 AGCTC 3TACC 13 8 0 

TGTAACAG3T CTTCAAG3TC CTGCT3GAAR CGGTCAGTCA ( 3CAAATCTAC TA3CTG3CT3 144 0 

03GOCAAAGT 3CGCCC3GOr G.AA.GAAAGTG AATTCGGGAT TAG AG A GO A- 3 GTAAGAGCAT 150 0 


GG03G3CAGG CTCAAGCA3 3 G7TGGCT3T3 CATGCTTCAC CACGACGTC3 T03 AGTTGGT 156 0 

GCAG3AACAG CTCCAG3TGC TGAGAAGAAA AGGCAGAAGA TGGTGT3CT3 TO 3GG ATOG 3 162 0 

40 AGGAGG AC AG TCTTCTCtGCG G3AAGTGGAA CGGGGTTAAA AGCATTAAAC TTCAAGGATA 1680 

AGATGCCTAA RAAAAAAAAA AAAAAA 1^06 






(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:

GAATTCGGCA CGAGCTTGGT AGO C TT AGAA CTGCATGAGC TGCTTTACCA CTGGGAAACA ST) ' 

CGAGCACAGC CTAGCTTGAT TTTGTATGTG GTATCAGATC TAAGGTGGAT GGAATTCAGG 12 0 

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acttcctctc tactctttga ttttgtttta tttttagaaa tgttttattt tg tttt attc lr-0 

att t att cat cttcagagac atggtctggc to tgttgccc acgatggagt cg atgg tg tg 2 4 0 

atcatagggc actg:agtgt tgagctccco gc-ctcaggcg atcctcctgc ctcagctycc 3C-o 

TTAGTAGOTG GC AG TAT A 1 OG CACATGCCCT AC CATGCCTG G2TTTGTCTA CTTTTTGAAT 3 CO 

GATGTCTCAA ACTAGAAGGT CTATTAATTT AAAAAATTAA G 3 AT AGC ATG CCATAATTAA 420 

aaatga.taac agtgogaaaa ggcaccttcc aatgattoag acatcaactt gtgatttaaa 4 80 

AAAACGAAAA ATAAATAATA '3GAAAAAAA3 GCX3AAAAAGT T;G\ATAAAAA TAAAATTAAA 54 0 

aaaaaaaaaa aaaaactcga oggo3GGCG3 gta 57 3 



(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 684 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:



CTCTTTGGGT GTGTCTACCT CCTTCATCTG CTGCGCCGAC ATAAGCACCG CCCTGCCCCT 60 

aggctccagc CGTCCCGCAC GAGGCCCGAG GOAOCGAGAG CACGAGCATG O 3C AG C AA 0C 12 0 

CAQGCCTCCC AOOGTGCTCT YCACGTCCGT TATGCCACT A TCAACACCAG CTGCYGC02A ISO 

GCTACTTTGG ACACAGCTCA CCCCCATGC-G OO3CC3TC0T GGTGOGCGTC A3TCC O'GACC 24 0 

CACOCTGCAC ACCGGCCCCA GGG2CCTGCC G:2TGGGCCT CCACA2CCAT C3CT3CACGT 300 

GGCAG3TTTG TCTCTGTTGA GAATGGACTC TAO3CTCA0G CAGGGGAGAR GGCTCCTCAC 3 60 

ACTG3TOCCG G^CTCACTCT TTTGGGTGAG CCT^GGGGSC CCA3GGCCAT GGAAG3A3 3C 4 20 

TTAGGAGTT2 GATGAGAGAG AG2ATGAGG3 C33P3CGCTT TCCCCCTCCC AGGC3TC2TG 4r0 

GGTGTGATCC C3TTACTTTA ATTGITGGGG T SWA LOG, 3 T0T2C3ATAG GT3TCTGGCC 540 

AGGCCCACCT GSTGCGGA'TG TGG TC TGTGT G3GTOTGTGG G0AGA-GGTGT GAGTGTGTGA 600 

G1GACA3TTA CCCCATTTCA GTCATTTCCT GGTGGAACTA A 3T 2A'3C AAC ACAGTTTCTC 660 

T c. ; A AAAGVAA A AAAAAAAAAA AAAC 6R4 



ttg:agtgag ggcttctcct 

ACAA 1 jOHAGA G jAG" ^CG jA 
T , ^TGAAATO~ , TTTA^TTTTG 

CTCTCCC 2G0 CCTCCTCTGT 



02ag jGactt 
>:tgaooaaog 
:ttactggtg 
o:ccgaoga: 
wgtctttccc 



(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:
TG* jA 1 ^ >7AG A TG : AC A< OG A< ^ AAAG OTTC 1 ?C GTC < :GC ACC • 7 TV TC AG A< 2 CT 

:gtc :cctcg cc 7Gccccoa gagctgccat 

rCTGAGOCCC CA3ATCACCT 
r A 70 A PjOi'j T 0< 3 2C 7CCG j 
7TGCA7CCGT G2G07TGTGA 
2CTCTTAAG0 CCCAA3GTO 3 
AGTG0AYTG7 T7ATG0CAGA TOTGTG3CAA T0TCTGG7TG 

Yc-rcrrr^cc Or ogytccoo: gcttgaiggt g^atgtoctc cttcctggtCC 
oo:tocttga gocttagtcc a:;googtcac tyctccca: :: coacctacct 
tt3tgag2gt ggv2agag2a goaaagtccc tgaaogcoot 2ag0oagtat 
cccaccttca g7tg7cct7g gat0goaag0 acccagcccg a7c7ctg0g7 
gtttgoa^at g7agytt7ag g0 n atto3g3a tg7aggttgt gtyogagotog 
taggg3tagt t0gcttg07c ttctctttg3 tgatcccacc cccag2catt 

GCCCAG2 GOC T 3G7CTGG3G G02G2G3AGA GOG A OC AG AA GG0G7TGGG7 

G AGO ACT OA 7 GAAOTGO 00 3 0rO3A2AGTG0 GTATGG7G0 0 TGAGOOAGOG 

GTTTGA7TTC 2CG0GATG0G TG7TIG7TTC TCAGOTGTGT CCGACCCCAC 

AACCCAAAG2 AA 1 0 AG 0 AAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 
7CCNG3G3GG GNOCCG 



GAGGOTGAGG 
CCCTGCTGTT 
G3GGTTCCCA 
AAGAA3GACC 
CCTO7AGG0G 

GrrrooGCAG 

GCAMCTGCGT 
CGGTCAGATT 
CA7AG3GTTG 
ATAG3C-GCCG 
ATAA7ACTGT 
CCTGG7AGAG 
TG2ATTGCT3 
AGOGGCGGTG 
GOO OTOCTGT 
CAT3TAATAA 
AAAAAAAAAN 



120 
loO 

2 4 0 

3 CM) 

:-6o 

4 2 0 
4 : ^ 0 
c .40 
600 
66 0 
' 7 2 0 
7 8') 
84 0 
90 0 
060 

1 XI 0 
1036 
(2) INFORMATION FOR SEQ ID NO: 87:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 908 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:


TTAAACAAAT GGAATCATGC AATATGTGAC CTTTTGCGTC TGGCTTATTT TATTTAGCAT 60 

AATGTTTTTG AGOTTCATCC AAGCTGTAGC ATGTATCAGC ACCTCATTTC TTTTTCTGGC 12 0 

60 TGAATATTAT TCCATTATAT G OATTT ACC A CAATTCATTT ACCTATTCAT CTTTTGTTTC 18 0 
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TGCTGTCTGG CTATTOTGAA TAATGCTTCG ATAAA2ATTC AT AT AC AAG T TTCTATGTC/J 240 

CTTTATGTTT TCATTTCTCT TOGCTATCTA CATGG- XAGTA GAATTCTAGG TCATAATATA ;>0( ! 


ATTTTATGTT TAACTTCTCA A^GAATTGCC AAAAGSTTTT TCATAGTOGC TGCATCATTT ; : . 6 

ACATTCCCAC CGGCAATGTA CAAGGATTTC TATTTTTGCA TATCCTT02A CTTACCAACA 4 2 f".- 

0 CTTCTTTTTX GTV/ATWATTT TGTTTTTTCA TTATTGCCAC CCTAGT<2GAT GTGAAATGGC 480 

ATCTTATTGT TTTGATTTGC ATFTCTCTTYA TGACAAATGA TATGATAG'IT TTTTTATGTS M" 

CTT AC<SG ATC AAA< 3GTATTT ( :GTT- SG AG AA ATGT 1 2 2CTTC AAGT 2C TTTG C C ATTT < -AAA 6 0" 

ATTTGGTTAT TTGTCTTTTA TTATTGAGTT TTAAGAAATT CTOGCCASGC GC AGTOSCIC 660 

ACCTGTAATC MTAGrACTTT GGG A 2,GCCAA aSCGSGCAGA TCACTTCCAS?; TCAGGACTTC 720 

GAGAC2AGCC TGGCCAACAT GGTGAAACCC CATCTTACTA AAAATAGAAA AATTAGCT3G 7 8'?' 

GCGTGGTGGC AGGTGCATGT AAT'TNTATCT ACT'2A4GAGG CTGA^GGAGS AGAAT03CTT 84 "> 

GAACCCAGGA GGOSGAGGCT G-2AGTGAGCC AAGATCArGC CATTGCArTC TACaACTGGGT 900 

GACACAGA 90^ 






(2) INFORMATION FOR SEQ ID NO: 88:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 655 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:

TGCA2TOGTT CCTTCTCCCC AGC AAATACT GCCTTCTTGT TTTTSTCTGA TGTGGC A GGT 6 J 

■3 AC TAG AAAA TCCGOCTTGG TATTCTTCAA AT SCAT AT AT ATTCCTTTCT r I v ■vrCAGCTCC 120 

GOGC AGCAAA AGTTGTTCCA CAGTGGAAAW TTAGGCATCC TCAAGTTT7Y TCCCAGCTTC 301 

TGCTGTGTTT TCTTAGAGTA AATTGCCAAT TTCTGTTTTT ACAGGAAATC C^TTTTTTAAA 3 60 

AATGGAATC A OTGTGGTCCC CATCTACTCT ('ACAAAAATTG CATTTTTO'T 7 TATTTTCAAA 42 ) 






WO 98/54963 




AAAAAAAAAA AAAAAAAACY GRAGGGGGGC CSGGTACCAA TTCGCCCTAT AATGA 

(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1102 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:
TTTTTTTTTT ACC ATTT AAA ATAAAATGAA AGTGACCTTC TGTTTATAAA AATCTTTGTC 
TGCATCTCTG CTTATTTCCT TAGAAGAGAT TCCAAGAAGC GGTGAGTGAT TTCACGGCAG 
CAGAGSGTTG 'JGACATATTA CGGG03O3GA TCCCTCTTGG AGTGAGATGA CTCT'XO^A^ 
AGATTTAGTC GTCA3CCTCG CGTGTGA3GC TG3GTCACAC CC3AG3GATG T3TCTATCAA 
< 3 ATGGAAG AT CTTTTACACG CTCTTGATTT TGTTTGSCTY TTTTTCTATT ACTAGTGAGA 
AKGAAACTTT TTATATGATT ATT AT SCAT'S ATAATCCAAC ACAAATTACT G3TTSAT3TT 
CTTTTACTTT CCTGTGAAGG TTTTAGTGCC TTTTAAAAAT TG STAT AT AT TASV3CTTGTT 
AATACTTCCA TGCT3TATTT GTGGSCATCA RTTTCCCCGG GNA'SA 3GCNT GCA'SATTTTG 
CCTTCACACG CTGGGTGGTT TTTC ATTTT 1 S AMTTCTATTT CTSGTTCTTC TATCGTTTTA 
TGTTCAGAOG GSTTTCTCCG T 3T AGAAAGO AGTTT AT" 3 AA GATTTACTTT CG A SAGTCTT 
CTCTCTACTT TCTACAGTGA ATTCTCT3AT GTGTCTG3GA GTTTG3GGGT CTQ3GTAAGA 
RTCCTCCTCT : SACCCTATTC TCTATTACGA TCCACAOSCT CATGSTTTAT GARATTGGTS 
GC CGGGARC G 3G3GAGATTT GCQ3ATCCCC CAA-3CCAGAC TTTATCCCCC TATCCCTGCC 
TCTGGATCCC ACGTACAGGC CTGGGAACTC SCTGTGGGTA GGGO 3 C AATG GTCTCGCACT 
CTCACCTGTA CC3CAGGGCT G3C AC AGS AT SGTSAAGSAG AG AG 3CTGCC CAAGCGCATS 
CYTCTGGTGT CCCCCTGACA CGCCTC3AAA GTGAGCAGGT AGGTTTCAAC AGCCCCACGT 
TGCAGGT33G AGATGAAGCT 0AG3GT33AG ACC AGTAT ST CACAGTTCTC TTTGCATGGC 
CGGGTACTTG TTAGTCAACT GATCAAGTGA AAATTCTAGC CCCAGAGGCA GGA SAATCCG 
GAACAAAATT AAACCA'GCCA GG 

(2) INFORMATION FOR SEQ ID NO: 90:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1533 base pairs
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(h) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:

G3CACGAGCC 3HCAGG3GCA GG3GOOGATA GCG2GA3GGA C3 20 3TG3CA G3 3GGA3GCG 6 0 

G3GGT3 3AGG TTATGGATCG AGOG 03O3G3 OGOO'3GGGCG T3CTOG3 33G GC 3 3TGCGG3 120 

T3MGTG3T3G TG3TGAAGC3 Cr3GC 330 33G AAG3GGAA3G 3 3TTGCA3 3T CTT3 3 3GAGT 18-1 

0A33rGCAG3 CCCTTTT333 TGA303TGAA AT3T3CTT3A 233TGAP33T CA3T0A'3O33 24-) 

0 33AA0CAGG CGO3GGAR0T G3TGG3GT3G GA3GAGGT33 G3G33T33RA CG3T3TG3TG 3 0 'i 

GTGATGTYT3 GAGACGGGCT GATOGAG3A3 GT3GTGAA 1 3 333TT0Ar33 AG 3 33*3 3TGA 3 6-' 

GTGGGAOAOC G3CATOCA3A AG2GG2TGTG TAGGGTCG3A 33AG3GT0TG GCAAC3CSCT 42 0 

'3O3A33rrC0 TTRAA 3 3ATT A TOOT' 3*0 3T A TRAG3A33T3 A3 3AAT3AA3 AOOT3 3TGA3 4 8-' 

0AAGTO3A0G OTATTOCTGT GCCGCCG3CT GCTGT0A000 ATGAA 3 3T 33 TGTOTOTGGA 54 0 

0AO3GCTTG3 G33CTGGG3G T ( 3TTGT 3TGT GCT'3A3COTG G3GTG333GT TGATTGCTGA 600 

TGTG3A0 OTA GAGAGTGAGA AGTATGGG3G TCTG33GGA3 AT03G3TT3A CTCTGGGCA7 660 

CTTCCT3CGT OTGGGA33 30 rG3GCACCTA CGGCG3C0GA GTG33 3TA0 3 T30 3TOTAG3 7 2 n 

AAGA3TG3GT TCCAAGA3AG GTG07T3 3 30 0GTTGTGGT3 GAG2AG33CO 0 3GTA3ATG3 7 8 'J 

AGAG3TTGTG 0GA3TGGAO3 AG30AGTGG0 CTOTOA3TG3 A3AGTG3TG3 CG3AC3AGGA &4m 

CTTTGTG3TA GT03T333A3 T'3CTGGAGT0 GCAGGTG333 AG73AGATGT TTG3T3GAG3 £*0'.' 

GATGG33CG3 TGTG3AG3TG G3GTGATGGA TCTG'TTCTAC GTG03G03G3 GAGTGTGTOG 960 

TG30ATO3TG CTGCGCCTCT TOGTGGCOAr G3AGAAG3GG AGG3ATATG3 AGTATGAATG 102m 

CCCCTACTTG GTATATGTG3 OOGTGGTGGC CTTCCGCTT3 GAGCOOAAGG AT" 3 3G AAAG 3 1080 

TOTGTPTGOA GTOGAPOGGG AATTGATGGT TAG3GAGG3 3 GT03A3GG03 A3-GT-3CACCC U4u 

AAAOTAGTT3 TG3ATO3T0A GGG3TTG3GT V;.\';" 3 3 .' ' ; OGOAOGTOGA A33333A33A 120'- 

GATG30AGCG OOAGAA 3AG3 03TTAT'3AC^ OOT'»3OO0r3 G-77GTO-CCTT AGT3T3TAOT 1267 

TO3A3GA0GG TT3<3T0G'rTO GOTA3GGCT3 '3AG33CGTGT 0CACA3CT0C TGTG3GGGTG 1320 

GAGGAGACTC OTOTGGAGAA GCtGTGAGAAG GTGGAGG3TA TO3TTT0,GO3 GGACA3GCCA 138!") 

GAATGAAGTC CTGGGTCAGG AGG00AG3T3 G3TG3O330A GC'T333TATG TAA3GCCTTC 144u 
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(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 575 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 

ATCCTCTGGA ATCTAGGTG3 AAGCCACCAA GCCTTCTTCA CACTTG2GTT CTGAG2ATCT 

i 2CAGACTT AA CCC'CATGTO^ CAATCACCAA GGCTTATGGC TTGTGTCCTC '2AGAACTGTG 

G2CAGAGCTG TACCTGGGC 2 CCTTTGAG2T aAGGCTGAAG CC AG A 1 jTCTG AAG2TCAG3A 

G33CAGTARG G02CTGGG2C TGGCCC2TGA AACCATTCTT TTCTCCTAAG CCTCTGGGCC 

TTTGATGGGA RGGGCTGTCC TCAAGATTTT TG AAATGC CT TTGG A 2G 2TT TTTG2CTTGT 

< 2 TTG3AT ATT GG2TTCCTTT TAGTTAT32T CATCTCTCTA GCAAGTGAAT GTT'TCACAA.C 

CTGCTTGGAT TCTTTCTCTA CCA2AGARCC AG3CTGCAAA TTTTACAAAC TTTTA 2ACTC 

T3TTTCCCTT TTAAATATAA ATTTCAATGT TAAGTCACTT CTTTGCTCCC ATATCT<3ATT 

TAG3TTGCTG « 3AAGT AG2C A AGTCACCTCT TGAATGCTTT GCTGCTTAGA AATTT02TCT 

ACTAGGTAG-C CTGGGTCATC A'CACTTAAGT TCAAA 



(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 639 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 

TCCTTT«2ATC TTAAGCACCA CCCGACAG3G CAGGTACTAT TACCATCTCC ■ 3TTTGACAGA 

TNAG3AACCT GGCACAG3AA GCATTTAAGT G3ATTCCCCA GGATCG2CCC ACTGT C AG GA 

G2AGANTCAG AATGGGCCTC AGCATCAGGC TCCCAATCCT GGCTTCTAAC TGCT303CTC 

TGCCCTTCYC TCWCCCCACC TCCCCA'CTCC AGTGCCTTTG GTCATGCCAC TG 2 AGCTTTC 

AGGC C AATAC TGGATTAGCC TCTTAGTGTT CTTGTCCCTG CAGC GATTT 1 2 CCCAGGCAG2 

AATTCCATGT GCCCTCACTG ATGTAGGTG3 CTCTTGTGTC ATTTGTCACA TC C T ATTGAA 

TTGTTTATGC ATCTTGTTCA CACTCACAGC ACCCTCCCTC TCACACGTCC TCCTTATAAA 

AATGTCCCTC AGTGTCTGCT ATGAGCCAGG TGCAGACTTA AGTGACAGG3 CTG2TACGGG 
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AAATAAAAAA TTAACAAGGA GCACCTOCCT CTT AATGC AC AGTAACAAAC TATGTTAAGT 5 4 0 

gtcaggaagg aaaggttaag gatgccagga ac^g c tttt aa taaataacc:t gacttagatg 6 00 

5 (x;caggtogt gctgargatt aagaacgtgt tcttctcga .• * 



(2) INFORMATION FOR SEQ ID NO: 93:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 744 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:



20 OA^TTCGGCA CGA* 1 ACT 30C TG3AGTCTOG CTCrCAGAG^G AAG AG AT C AG CAGO0AGGGA (-0 

'OOOAGGOOET GTCACATCTT TCCTCTGGOC A' ITGTCCT GG TCTTTGTAAG CCCAGAATCT 1G 0 

CCCCTTC2GT GAAGG3AGC;G CA3CACCCCA 1 G3AGGGCAGC AGGTGTGCTG TGA03GTTGG 180 

25 

AGTAGTGTCA GAG3TCAG3G TAG ACT AG, \A TG3CCATGGA CACCATGTGG 2 40 

GG2TGG32CA CAGAACAGTG TCCTTCCT3C T CO TG CTC 02 ' 2T<3C AGCTTC CCCCGACCTT 3 00 

30 GTNGTrTATT TGGTTTGATA CCAATCAGCA GACCCTGCAA G0TG3AAGCT CCCAGGCTCT 3b0 

CAGTCCCAC3 ACTCTCATGT GCCAGTCA ?C 3NTACTGTAA CTGCCCAAT3 AGTA3TTCTT 4 20 

GOOCACTGCC AAGAT\GAGC CA3TTTACA AG ACA' GGG3 A ATTGCAGTAG AGAAAGAGTT 4 - 0 

35 

GAATATACAT AG A GG 0 AGCT AAATGGGAGA GTQ3AGTTTT G TTATTAGTT AAAT0A3CCT 54 0 

GG 0YTAAAAT TGAGA 30TGA GAATTT1TGA A 3GACA3TTT G3TGG SG AG3 C G T AGG 3 AAT 6)0 

40 G3AT3CTC-CT GATTG3CTAG G3ATGCAATC ATAG3GGTGT AGAAAAGTWC C TTGTG 2ACT 66 0 



GAOTCCAGET T'OGOTOAGAG CTACCAAGOA O0TGCTGGTC TGCTOGTCCC GGTAGAGCCA 
TGTC-GTG'TCA GGAAT ; AAA AGTG 

(2) INFORMATION FOR SEQ ID NO: 94:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 526 base pairs
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AAGCC 3ATAA GT3CTGGCCT GTTGGGACAA 
ACOGCAYTCA GCCATCYTAY TCCTG3GGAA 
5 GTTGTAAAAC TG 3 AAAAAAA TTTTAGAAGA 
TAAAATGTAG AAAACTAAAG CA 2AGAGATG 
GTTAGCARGC TTGGTCTGGT gacctttcta 

10 

CAC-CACAGAT G3CTGCTGCT ATA3CTG3GG 
CCCAAGTTCC CATAGTCTAG GTTCTGrTTC 
15 TC:CTTACCA 2TCTACCAGT GCTGGG3GAT 






ATGA3AGAAA TCCCATAGGG T3GTGATGAC 12 0 

AATGAAACTT GTGCTCCTAT CAAATGCTCA 13 ) 

CATCTTGTGC AGCATCTGTG TTT ATGTC T A 240 

TT AAATG TTT TGTGCAAGGT CGAACAGCTG 30 0 

CTGAACCACA GTG2CGCTGG GGGAAGTCCT 3 60 

TATG3G2AGT ATTAGTAGTT AAGCAGTCAA 42') 

AGCTGGAGGT TAG33AAAAA CACAAGAAAA 4 3'.} 

GTACTAAGAG ATCCCC 5 2^ 



(2) INFORMATION FOR SEQ ID NO: 95:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 426 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:

30 GOrACAGGGC AG3AGAGACT TG3TCCATGG GGAGAAG2CT GC AGT AT AG A TGGGACCTCC 60 

ag^ag:cgaa GTACXTATAGA CCCTGCTGAT CCGGGGCCAT TGAGCCAGAG GATTTGGGCT 120 

GAATGTCCGC AGAGACAAAA G3GAAAGGTA GATCCTTTCC CTTAAAGAT3 AAAGCCATCG ISO 

35 

CCCG^GGTTG CTTATTGCTC TCTCTCCTGG TCCTTCCACA TGTTGTTTCT GAA< IATTTGT 24 0 

TCTG3CAT2A CAATCCCCGT CATCCTGTCA TCTGGCCCTT CCCACCTTTC CACCTTATCT 3 00 

40 CTTGCAGTGT CTCCGCGTCG ACCTGGCACC TGGGTGAARG CTTGCTCTIG CTGGTGCCCA 360 

TAGCCCCCAG TGTAT3GTCT T3AMCTCCCC AGCCATATGG ARACCCACCT CAGGAGGGCC 42 0 



OCT 



420 



(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 844 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:
GGCACAGCGG C AC GAG AT AG GAAGCTTGGC AGGG3CAGCT CCCCCAGTGC GCATTGCCCT 6 0 
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10 



30 



GTAACTCGAG CGCCTGGGAG TGGG3AGAGG CTTOGAAATG GAG2AGG-GTG GTGGACCTCG 



TCTTCTCCT2 

c::cagcccaa ggaacaactg agaatactga gt3c:agggt agccctagcc ccatttcaca 

CCTGGK3CAAA 3TCAGGTCAC TGGATTCAAA CA:TCAGATT TAAACCTCCT CTGTGTCT3C 
AGCACCTGTA TATAA ^T3CC AGCCTCT3CT GCCCCTCI CC AAAAAGTCTC TGCCCTTOTC 
TTTGGCACCT GTCTCTCTCC TCCCCATTCT CT3CTCCTCC TTTCTCCAAC TCAGANTCAC 
CCTGTTAGTT CAGCAAATGT TCATCGAGCT CCA1AATOTA GCACS3ACAGG NCTGTCTAA" 
1 5 AGATTCTCGN CTTGCAAGC'G TGAGACAAGT ACTCTCCATC TTTCTCTCAT CTTCACAC-AT 
GGTCTGCTCA ACAACTTT< >C ACTYiAATTGT AAATAATTGA TA'.TGCATAA AACATTGATG 

_ , , „, mi ^,,> ^ ^^AT^AT r,arsAr rT r.r? CAAC-GATCTC 

TTCTTTAACiG G'l'AG .'G< AiA- >VfV..j ji^'v\ vj . . - ^ - - - ~ — ■ - 

20 

TCAGTGAAGC ATTTC^^GT GGTAGGTGTG CCTATGGGTG AG 3TCAGCT A TC TC AC G 2 CA 
TCTACTTCCA CrTTGCC JCCC CAP2CCA3GC TCACCCT2AG CT jAGATGCC TCAGCA3GTG 
25 GCAGAAAGGA G2CA2CTQST TTATCCTTCG GGACCACAAA CTGCTGTATG CAGANCACA3 
TTTT 



(2) INFORMATION FOR SEQ ID NO: 97:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1985 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



120 



GAT 2C TAG GC2TCCTC2A TAACAC CT AC CTA5CACC-2.C CTGGGGACTT 18C 



240 

300 

3 60 

420 

480 

540 

600 

660 

72 0 

780 

840 

844 



40 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:










AGCCCTGOTG 


AAGT At GAGGT TCTTCTAT : A 






GAC 


aGA-\CACG 


6 0 




AAAG^AGATG 


AG03ATGAAT ATGI^XAGAO 


■ )CTGAG 2AAG 








12 0 


45 


GT2TTACCTG 


?Jr. -CG2CC G\ T : AG-/ > GIG 2 A 




C TCG2TOA 'A 




atoatgt 


180 




AATG03TGT S 


CAA2ATACAG G AA : ^ j AAA G j 


ATTCTY 2TGA 


AAGC CATCGC 


TC' 


:ggagoag 


240 


50 


GAACACCATT 


TTC AC ' 2 CT A 3 < 3 AAG G G GCG 2 


CTCTGTCATC 


TCCCCCACTG 


AA 


2TTGA2GC 


300 




CCCO ATCCTO 


OTGCCTC AC A G AGCG2AGC G 


SIAGAGCAGA 






v- ggtg 


? 60 






GTTCCTGCCC TGGACAGGTA CTGGGGAACA 'GGTG2TTGCC TTG7TATGG2 CACOGTTTGA 660 
ACTGATCCTG GAGATGA-\TG TT 2AGAG2GT 2CGAAGCACT 'GACCCCCAG2 G2CTAGGGGG '7 2 0 

GTTGGATACT CGG2CCCACT ATATCA2ACG CCG2T ATG2A GAGTTCTCCT CGGGTGTTGT 73 0 

CAGTATCAAC CAGACAATTC CTAATGAACG GAC2ATG2AA TTG2TGGGAC AG2TGZAGGT 8-10 
GGAGGTGGAG AYTTTTGTCC TCOGAGTGGZ AG2TGAGTTC TCCTCAAGZA AGjAG:AG:T 900 
TGTGTTTCTG ATCAACAACT ATGACATGAT -GCTGGGTGTG 2TGATG3AG2 QGGCTG2AGA 960 

T GAG AGO A A^ GAGjTTGAGA CGTTCCAGCA G2TG2TCAAT G2TCGGACAC AGGAATTCAT 102 0 

TGAAGAGTTG CTGTCTCCCC CTTTT GGGGG TTTAGTGGCA TTTGTGAAGG AGG2TGAGG2 10 30 

TTTGATTGAG C J 2 GGACAGG C'IGAGCGACT TCGAGGG2AA GAAG2C2GG3 TAA2T2AGCT 1140 

GATCCGTGGC TT2GGTAGTT C'CTGGAAAT 2 ATCAGTGGAA TCTCTGAGT2 AGGATGTAAT 12)0 

G2CtGAGTTTC AC 2AA 2TT 2A GAAATGG2AC GAGTATCATT .2AGGGAG2G2 T2ACCCAGCT 1260 

GATC GAGCT 2 TAT CATC GZT TCCACCGGGT G2TGTCC GAG C2G2AG2TCC GAGCCCTCCC 1220 

TG2G CGGG22 GAG2TCATCA AC ATTC AC ■ 2 A CC TT AT- GGTG GAG2TCAAGA AG2ATAAGCC 13 80 

C AACTTCTGA TGTGCCAGAA ACCG2 2CTGA 'GATCTG2 2GG TCATGTCCAT GGACTTCTG2 14-10 

AGG 2CATTCC ATA 2CCTTCT TOACCTGGGG TACCCCTTCC AGTTTTCC 2G TTGCTTCCCA 1S0 0 

GGC2CTIGA2 ATGG2TTACC TGC7TTCACT CCCAGCACCT TGCCCAAGAG GATAAG2TGG 156 0 

AT 2 2CCTTGG 2 CrTCT GAA2 AT2CCAGTGT CTTCA GGTTT CCCAAGACCA CTTCCCTGTG 162 0 

GG2TTC2AAA AT G G 2 C TT 2 A T2ATTTCTCC AGTCTGTCAC CCTGCTTTCC TG2TCC I 2ATA 1680 

GAG 2GAAGGG TTGTTTCTTC CCCTGTAAAA ACCA2TGCCT CAATCTCTGG TTCACTCAAC 1740 

rAGTCACCAT GTCCTGAGGC ATGAAGCCTC CTCAGCTCTT QGAATTQ2TG G2 AAG GGGTG 1300 

ACTGCCTCTG AGTCATTGTG TTTTTCAAAG TGATTTCTTT TCTGTAGCTT TTTGACCTAA 1860 

G ATCTC A GC A AITTGAACAC TAACCTCTCC CCTCCTGG2T CAA 1 GAATT AC TCCGAAGTCA 1920 

GTCTG2A 2 AA AATAAATATT TAGTATGACA TGAAAAAAAA AAAAAAAAAA AAAAAAAAAA 198 0 

AAAAA 198 5 



(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1416 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:
ATATGAAGGG AAAGAATTTG AITATGTTTT CT 2AATTGAT GTCAATGAAG GTGGAC CATC 
AT AT AAATTG C CAT AT AAT A CCAGTGATGA CCCTTGGTTA ACTGCATACA ACTTCTTACA 12 0 

GAAGAATGAT TTGAATCCTA TGTTTCTGGA TCAAGTAGCT AAATTTATTA 1TGATAACAC 
AAAAGGTCAA ATGTTGGGAC TTGGGAATCC CJVGCTTTTCA GATCOATTTA CAGGTGGTGG 
TC^TATGTT CCGGGCTCTT CGGGATCTTC TAACACACTA CCCACAGCAG AT C CTTTT AC 
AGGTGCTOGT CGTTATGTAC CAGGTTCTGC AAGTATGGGA ACTA2CATGG CCGGAGTTGA 
15 TC CATTTACA GOGAATAGTG CCTACCGATC AGCTGC AT C T AAAACAATGA ATATTTATTT 
C3CTAAAAAA GAGGCTGTCA CATTTGACCA AGCAAACCCT ACACAAATAT TAGGTAAACT 
C\ACGAACTT AATGGAACTG PArCTGAA^A GAAGAAGTTA ACTGAGGATG AC TTG AT ACT 
T I'TTGAGAAG ATACTGTCTC TAATATGTAA TAGTTCTTCA GAAAAACCCA CAGTCCAGCA 
ATTTCACATT TTGT3GAAAG CTATTAACTG TCCTGAAGAT ATTGTCTTTC C TGC ACTTGA 
25 CATTCTTCGG ITGTCAATTA AACACCCCAG TGTGAATGAG AACTTCTGCA AT G AAAAGG A 
AGGGGCTCAG TTCAGCAGTC ATCTTATCAA TCTTCTGAAC CCTAAAGGAA AGCCAGCAAA 






180 
240 
300 

3 60 

4 20 
4 8 0 
r >4 0 
600 
bSO 
720 
780 



C CAGCTG2TT GCTCTCAGGA CTTTTTGCAA TTGTTTTGTT GGCCAGGCAG GACAAAAACT 34 0 



900 



960



CATGATGTCC CAGAGGGAAT CACTGATOTC CCATGCAATA GAACTGAAAT CAGGGAGCAA 
TAAGAACATT C AC ATTGCTC TGGCTACATT «3GCCCTGAAC TATTCTGTI'i' GTTTTCATAA 

35 AGA2CATAAC ATTGAAGGGA AAGCCCAATG TTTGTCACTA ATT AGC AC AA TCTTGGAAGT 1020 

AGTACAAGAC CTAGAAGCCA CTTTTAGACT TCTTGTQGCT CTTGGAACAC TTATCAGTGA 1080 

T3ATTCAAAT GCTGTACAAT TAGCCAA 7TC TTTAGGTGTT GATTCTCAAA TAAAAAAGTA 114 0 

TTCCTCAGTA TIAGAACCAG CTAAAGTAAG TGAATGCTGT AGATTTATCC TAAATTTGCT 1200 

J T A GCAGTG3 G3AAGAGG3A CGGATATTTT TAATTGATTA GTOTTTTTTT CCTCACATTT 126 0 

45 jACAtgactg ataacagata attaaaaaaa gagaatacoc tgoattaaot aaaatttt^c 

ATCTTGTAAA GTO3TCGGGA GGGGAAACAG AAATAAAATT TTTG3ACTGC TGAAAAAAAA 



1 3 : 0 
1380 



50 



AAAAAAAAAA AAAAGG AAAC TCGAGGGGGG GCCCGG 141 ° 
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(xi) 3 EC-UENC E L E S C K I Pr I ON 

r jtctacgc r a atc aa 3 at G3 ggacatactt 

5 

agattg rcta aagtagtgac tg3aaatcac 

2'3aga\g3ac catgg3cat3 tg3acaatca 

10 zcccgg3A3a aagtgcagtg catcct3aga 

ct3gccaarg ag3actctgt ccctggag3g 

ttgataaagc, caaatccagc ctgttt G3tg 

15 

G-CTAGCTGTC TGTCT 3GAGA G3AGTCCTAT 

■ r roA 1 r r aa\a c g a tc 3 atg a -zc- 3 aaagt 3 a 

20 tgtta-\t:ag a;:aaa:agat ctgtgagaag 

< y rrrT 3' r atg ata 3tg za< 2 a g :at> 3 ag g 2 a 
atgag::taac: aagzasgttg tctcgtzttt 

25 

.v rrTTG' rr 3T o 23c aa 3 f ta 3 a 3 act a st a 3 

TCTAA3TATA aaaaacaaaa caaaaatctc 

30 ttgcttttzt ttccttttat tttaaaaaag 

a 3att aatg a g zatctaat 3 gaaa3aa 3gt 

< >2G'ita ( ;r 3g cattatgta' r aaata< 3 a r tg 

35 

aa r at aa fat atatttata r ttg afgt aa r 

A' 3T A TP 5G TG ATAGTAG G 3T GT 3ACATACT 

40 TAAG3T3AG3 TTAAAA3GAA AAAAAGAAAA 

AGAGTTTATT GAACTTTCTA GGTATG^3AGT 

G AAG 3T' :A 2 T AA3TC A3TT A G A 3ATCT "AG 

45 

TAGAGTGATT ACCATACATG AG ATAAAAAG 

TTCTTCTC2T CCGAAATAAT ATAC2TG2AGA 

50 AATTGTAArA TATTTTTGAT GATTATTCAG 

T jTC ATTTTT TAAAAAACTA ATTTGTATTG 

ATAAAGTATT TTAATCAACC ATACTATTCT 

5:> 

TCTGTGCCTT TATTTCCCTC TTCTGAAAAA 

T3TATATCAA TA\TTAATCA ^AATO^TT 

60 CACCAAAGZT TCAAG3ACAA GTGTTGTACA 



SEQ ID NO: 99:

CG3'3AC'3AG3 TTCTTCATGA AC AT AT C C AG 60 

A3AG2TCTT2 AGATA CCAG A GGTTTATCTT 12 0 

1 3 AAA 2 2 A 3 3 A CAATAAGTGC TTATAAAACC 130 

ATGTG3TCTA 3GATTATGAA C 3T2CTGAGC 2 4 0 

GAT3ACTTTG TTCCTGTGTT G3T 3TTTGTG 3 00 

TCTA3T jT'X AGTATATGAG TAG-CTTTTAT 3 60 

TG3TG3ATG3 AGTTCAGAG3 A 3C AGTAGAA 42!) 

CCA\GA:CAA G3CCCACCA-\ G32AG2AGAC 430 

GTG3ATCAG3 T33TTT<3AA3 G:T3A\3ATT 34 0 

TTTTAAA3CA GATCTTTACT AAA2AG3TTA 600 

GG3GT 2TTTC CTTTCTGAGT TG3ATATTCT 660 

TA3AAAAA3G GACCAGATTT TTCAAGTATT "'2.0 

TTA3GAAATG TCTAGACCTC <G\TTCTT<3GA 7 30 

AA3AGTA3CC CTCTTTTAAG AT03TGTCTT R4 0 

ATGAGTTG3A CTGAG3ATTA < 3AAT A 1 3 T G 3T 900 

A 1 3 ■ 3 T AAA'T P< 3 AAAG3TAA3A AG3AAATGTA ''6 0 

ATGGACAT3T GGAGATTCTA ATAAY3AAGG 102 0 

GTCTTGTGAA AT f 3GTTTCCT TGACAAAATT 1080 

AGTACACAGA A^T ATTTA TP AAAATGTAAT 1140 

TTGATC^3ACA G3GCT 1 3CCTY TAATGAGTGT 1200 

CGT3GAV3TT TGTGA3CCTG (2ATTA3GAGA 1260 

GAACAGT3GA TAGCTCATAG TTTAT03T03 132 0 

AATCCOAGAC AGAG2TCCTT ACAAACCTTT 1380 

ATTGAATG3A CAGACCAA3A ATTC3AGTGAA 1440 

TCT3CTCTAG TGATACAA3T TTTACTAGTG 1^.00 

T AT ■ 3G AAAAA AATATCTATT TTG3CAG3TT 1560 
A^GTCTGTGT TTTCATAGTT TGGTTT'3CAT . 162 0. 

TTG jTGCGTG AAAAATTCXXT G\Tv33AGGCA 1680 

TG3G3CATCA CT3TCTGGTT TCACTTCGTG 174 0 
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TGTTTCCTAA ACA CATTTAG CTGCTTTTTT AACAAACTCA GO CCCATAOT TGAGTCCCTT 1300 

GTTGTTGGGA GCATTTCCAG GCATCTTTTA AG3GAA0TGT GA2AAACAGC CTCGGGCAGA I860 

5 

TGAACAC3GA GGCTCTCTGT TGTCTGTGTC T3AGATCTTT GTGT CTGG3A ATGC2TAAAG 1920 

OTTTT 1 3NTTT TTTTT 1935 

10 

(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 599 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:

GAATTCGGCA CGA'^GTCCA CGCAGCCGCC OGCCCX3CCAG CACCCAGOGC CCTOCATGXTC -'A) 

25 A 1 3GTCGTT' 3G AGGTOGCAGC GAGACATGCA CCCGGCCCGG AAGGTCCTCA GCCTCCTCTT 12 0 

CCTCATCCTG ATGGOCACTG AACTCACTCA AGACTCCGCT GCCCCCGACT CCCT30TGAG 1-30 

AAGTTCAAAG iG3CA<3GACGA GG3GGTCTTT GGCTGCTATT GTCATCTGGA GGGGGAAGAG 2-10 

30 

TGAGAGOOGG ATAGCCAAGA CCCCAGGCAT TTTCAGAGGT GGCGGGACCT TAGTCCTACC 3 00 

CCCAACACAO AOCOCTGAGT COCTCATCCT CCCTTTGGGO ATAACGCT3C CCTTGGGG02 .3 '50 

35 TCCAGAAA2A OG03GTGGGG ATTGTOCCGC TGAGACCT3G AAGGGCAGCC AGCGTGCCGG 42 0 

GGA'GCTGTGT G>OVTT'GCTGG ■ TTTAATAT' 3C A'303CTTOG^ GGGCTGTGGC CACATGCCC 3 4 30 

GCAGGA-03TG A3TGAGGAGC CCTGTOG03T GCTGGTGT3G GGATCGTOSG CATTT'CAAAC 54 0 

40 

GGGCTTGTCG TACCCT'GAA 2 AATGTATCAA TAGAGAAAAA AAAAAAAAAA AAAACTCGA 59 9 



(2) INFORMATION FOR SEQ ID NO: 101:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 784 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear






24 0 
300 
350 
42 0 
480 



CACTTTGGAC ATCCACTTGT C AC C AAAT C K RGTTCATTCT GGATCCTAAG TAAGTCCTTT 
GATTCCTCCA GTTGTTCATT AGTAATGTCT CAARTGTAAT TTTTTCTAGT A( jTTTTCAGC 
CTGTCTTTCC KGCCTTCAGT OTTAACTTCT C G AGT A CAT A EGCCACATTG TTGTCAGCAK 
GATCAWATTT TATTTAAAAA TA2TTTACAW AK 3TTTATKG CCAAATATTA GRAAATACAG 
ATTCATGGAA AGAAAAATCA CTOTCCCAAG GA3GTCACTG GCATGGTGAG GTTAAGGGGT 
GATTTTAATT TTTAAAAATG TATATTTTTT CCTGTGTAGA GTAGTAACAC CCTTGAAAAC 54 
ACAWTCCCTT GTAAAGTCTC TA\ T 2 C TO T A CTCCGOATCT AGSTGRTCTC TTCTTTCTCA 
< 5 AT ATTTT AG AATTTCATTT ATCAcCCACCT TTCTCTAGCC TTTACCCGTC TCTTCAATAT 
TWACATATGC AGAAGTTT C T CCTAACAAAC ACCTGCCTCT GGCTCAG7TC TGCTACCACC 
CTGTTGGTTT CTTTCCCTTC AOAATCAAAT TTAAGAGTGT CAAAAA^VAA^ A3\AAAAAAAC 
TCGA 



600 
660 
720 
780 
784 



(2) INFORMATION FOR SEQ ID NO: 102:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1035 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:

AGA3GCCTG3 CTGC G TTGG ■ 2 CTATC TC 2GT CTCCGCCACC CACTTAGC3T TTTAGGCATC 6 0 

AATTACCAGC AGTTTCTCGG CCA^OTATCTG GAAAATT AC C GGATT03TCC CGGGAGAATA 12 0 

CAAGAGCTT3 AAGAACGCCG CAGTTGC3TG GAAGCCTGCA GAGCAAG03A AGO AGCGTTT 180 

GATGCCGAAT ATC AGCGAAA TCCTC ACAGG GTGGACCTCG ATATTTTAAC CTTTACGATA 24 0 

45 GCTCTGACTG CGTCTGAAGT TATCAACCCT CTGATAGAAG AACTTGGTTG CGATAAGTTT 3 00 

ATCAATAGAG AATAGTTAG3 TG3T3ACACT ACTTCAA 3 A3 AACCTCTGCA TT 3 CAGTCAT 36 0 

^ ACCAATCCTG CAACTTGATT TTOAGAA3TC AAGAGTAPAT CGCGATAAGA CAGTGCACAG 42 0 

GTGGAG3GGA AAAAAAG3GG GAGGG3GAAG CTTATCTTGA AAAAGGATCA GAG AAGT AGA 48 0 

AAAAAATGTC GAAAGGATTA TAA.CTGTAAC GTTCTTT3AG TTTGTGATTG AT3CACATTT 54 0 

55 TTGCCC 3TGC ATTATGGAAA ATCTCTCTCA GO\TTGCTTT ATTACAAAGT AAAGGATGGT 600 

TTTATAAAAT TGAGACTG AT GAAACATCAA TACTAGA3CG CATGAGGATG AAAGAAATTA 660 

TCAAATAGTG GTGAACAGAA TAAGATGTTA ACGCTGAGTT ATTAGGACTG GAAGGOTATG 72 0 

60 
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353 

AAAAGAACTT GAAATTGTCG GAATATGTGC TCTCTTCATG TCATATTCAA TAG AAG TTT C 730 

T AGTTT AAG A TTGATTTTGT GTTTTCTTAG G2ATTTCAAG TGACAAGCAA AGTAAATOTA 34 0 

5 TAT ATT AT G T G AT AAATC AT GTTTTCAAGA A2GTCAAATT TGTGG AGTTT TTTCTTTCAA 900 

TTTTTAATTT TTAAAGTTTT TTTGGTATTA AAAAATCYAT TGACAA ; GGCA AAAAATV*TV\T 960 

WAAVrWTWCN GCGAAAAGCC AAAAAAAAAA AAAAMMAGC-G GGGGCCGGGC CCCATCCCCC 102 0 

0 

caaggcxxttc crorr 1035 



15 

(2) INFORMATION FOR SEQ ID NO: 103:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2213 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:

25 

AC^GTATTAOj CCCTTTTGTG GGAGCCCCAT GTTTTGTTTT TCTGAGTTGG TGGGGAGGGA 60 
SGGAGGG:;GA G3GOTGAATT GTTTTGCAGA <G3AAGATGGC ATCTGTGCTT TAAATTTCTC 120 

30 attactg:;gt tagaaaacaa agagggaktg ccctgcacat tttcttttgt gcttttaaat 18 o 

GTTTCTTAAG TTGGAACAGG TTTCCTCGGG 2 :TGTTTTGA CTGATTGCTG C^AGTGCATTT 24 0 

gatagttaaa aattactaat tggttttatt tccottcaca ctctgcctcc c:cacttctcc 300 

35 

ccccgttact gaaaaataac cattttagtg tcaggctaga aattgaattg ctgagttttg 3 60 

tgtatccttt aaattaaaaa c2acaagtgt ttattgtagt ggttaaactg TA3CATCTCA 42 0 

40 gcatctgggt ggaagctgcc tatatttctt c c c a gttt aa ctggggacca TCTGTGAAAT 480 

CAVTTTTCCA TCCAGACAGC TGTTGTGAGC AAATGAACAT AAATGCTCGC CGGAAYTTTA 54 0 

C T AACC- .■* iTT TTTATA'TTGA 2CTOCAGTGT A\AAAGCAOA TTTAATTATA AA C AAT AT AT 600 

A 5 

TCAAAATGGO CAAATTTTAT TT r C AAA TGC AGTGTAGA.GC TAGATTAAAA ( 2AC AAC TCTTT 660 

GCCACCTA2T CTGCCCTTTT GGCAAAGTTA CCTTGAACAA AGAATCTTAA GGGTTTATTA 72 0 

50 AGAACTCTTT ATTTTCTTCA TACCCTGTTC TCTG2AGTX: TTTCTAACAG 1 2TTCTG3GTG 730 

CAGATTTTCT TCGGCATCCC TTTGCACTCA GCTTATTACA GGTAGGTAGT GCTTAAGAAA R40 






GCAGTTGAAG GCO0AAGGCT CCACTGCATT CTTTGGCTAA GGOCTGAAGG CTTGCTCATC 1140 
TGT AAG ATC T AT ACTGGAGG TTTTGTTTTC CTTTTAAAAT TC TTTAGGG A GAGAGGGATC 12 00 

G7TTCTGAGG GCTTCTGAAA GTATGATTCA ATGTGCAACA TACAGGTAGG TCTTCAGCAT 
AAGCT3AAAT ATATGCATGT AAAAACTTTG ACATCTTTTT TTTTAATTTT CSACTTTCTT 
CTTAA STTTA CTTCTCTTTT TGTCCCCCCC CCATCTTACA GAAGTTGAGO C 2AA'3GGAGA 
ATGGTAOGCA G A G AA( j AAAC A r rGGCAAACT GCTCTGTGCT TTCAAACCAA AGTGTTGC 2C 144 0 

CCAA2CCCAA ATTTGTCTAA 'XACTQXCA GTCTSTTOTG GGCATTGTTT TCTACAACGA 
AATT2TGQGT TTTTrrCTTC r'TTCTTTAAA CATAGAGGTA -AGCA'/AAG G^ATGCCCTA 
ctct . :t . ;o:a GCTCTTGAAA GCATCTGTTT GAG3GAAAGG TCTCTGGGCA AGCAAGT3GT 
TATTTGGATT GCTTG2TTCC CTTTTTCCAC : "TG3GACATT GYAATCATAA AATAACAGTA 
AATT 2 C AAAC CTCAAAAACT ATTATGGCCT < AAGCACAGCT GAAATCTAGC AGAGTTTAAC 
T-TTCT:X:CT CCATGTCTGT CACTTATAAT TCAGGTTCTG CTGTTGGCTT CAGAA2ATGA 



(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1351 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



55 



1260 
l- ? 20 
1280 



l'JOO 
15 6 0 
162 0 
1680 
1740 



18 00 



GCAGAAGAAT CCTTTTATGC TAGTTATTGC ATTCATGGTT GAAACTCAAC TTAGG3AAAG i8 60 

G2TTCCAATG tattaagcaa tgggctgctt ctccccaatc ctccctaac:a ATTCGTTGTG 19 2 0 

T2GACTTCTC ATCIAAAAGG TTAGTGGCTT TTGCTTG3GA TCAGTGCTCT CTATTGATGT 1930 

TCTTGCT3GT CTCCAGACAC ATTCCTGTTG GATTAAGACT TGAAAGACTT GTAGATGTGT 

GATGTTCAGG CACAGGATGC TGAAAGCTAT GTTACTATTC TTAGTTTGTA AATTG TC C TT 

TTCATACCAT CAT2TTGTTT TCTTTTTGTA GGTATAAATA AAAACACTGT TGACAATAAA 

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAA 



2040 
21 30 
2160 
2213 



60 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:






CTT C AC AG AC 


TGACAGAATG 


GTTTTGTTTT 


GTTTTGTTTT 


Qrj-^pryv^Q rrvyvyvp 


GTPTTT3AGA 


60 


TGGACTCTAG 


CTCTGTGACC 


C AGGCTGG AG 


T3CAGT3GTG 


CGATCTCGGC 


TCACT3CAAG 


12 0 


CTCCGCCTCC 


CGjGTTCTCA 


CGATTCTCCT 


G 2CTC AGC CT 


CCCGAGTAG2 


TG jGAC TACA 


180 


GGCGCCCACC 


ACCACGCCCG 


GCTAATTTTT 


T2TATTTTTT 


AGTAGAGACG 


GG3TTTCACC 


240 
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10 



on 



40 



45 



50 



ATGTTAGCCA GGATOGTCTC GATCTCCTGA GGTCGTGATC CGCCCGCYTC GGOCTCCCAA 3 00 

AGTGCTGGGA TTACAGGCGT GAGCCACCGT GCCTGGCCCA GAATGGTTTT TAAAGCCACA 3 60 



GTTGAGARGC CACCCATT3C C0GGC<32CTG 3ACAGTGATC ATCTTG'I'TCA TCTTC 



TTGTTCAG 4 20 



TCCTTTCTTG TGTGATTGGA ATTATTCATC CCOTTTGAAA GATGAGAAGG TTGAGATGCA 

AAGAGTCTAC CTTTCCAAGT TOTCACTGCT ■ 3GAAA3 ARCT AGAAC^CACAG TTCAAAGTTC 

TGGWTTCTGG ACTCTGCAGT O^VGGTYTCr OTTYTCCCAC TTGC2TAOCC TCAATGCCAC 

ACTGTTTTTG AAGTGGCC3A TAACTTGAAG " RAAAGTTT A AAGA 2AGTT2 AATTTAATCA 

15 TCAGRAT3CA TTCmTTT TTTC3GARAC GGAKTTTCAC TCTTGCTGC3 CASGCTGGAG 

tc<:aatc;gto caatgatctc cx;ct2ac7gg aacctatgct tcctg-sgttc aagngattat 

CCAGCCTCAG CCTCCCGAGT AGCTGCGATT ATGGGCGCCC ACCAGCATO: CCAGCTAATT 
TTTGTATTTT TTrTrTAUT AGAGAT 33GG TTTCGCCAGG TTGGGCAm: TGKTCTTGTG 
AAYTCCT03C YTC AO 3 TG AT YTGCCCACYT CATCYTCCAA AAGT ^CTGGG ATTACAGGCA 
25 TGAGGCACTG CGCCTGGC^T CAGAATGCAT TCTTACACAT GTATCGTAGA CATTTATAAG 

CACTCTAATG GATAACAATC CAAGAATAAA TGATTGTAAA AGATGATGCC GAAGAGTTGA 1080 
^ TGTGAATCTT TTTTTCCTAA G AAAAAAAG T CGGGGAGTAT TAAATATTTA OiATCAATGTT 1140 
TATAAAATGA TTACTTTGTA TATCTCATTA TTCCTATTTT GGAATAAAAA OTGACCTTCT 1200 
TTAATCATAT ACTT3TCTTT TGTAAATAGC AGCTTTTGTG TCATTCTCCC '3ACTTTATTA 1260 
35 G1TAATTTAA ATTOSAAAAA AC^CTCAAAC 1 1 AA TATTC TT GTCTGTTCCA GTCTTATAAA 1320 
TAAAACTTAT PvATGCATGTA AAAAAAAAAA A 1351 



(2) INFORMATION FOR SEQ ID NO: 105:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2066 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:

GGOAC3AGGC GOC3GAGGGC CA :AATCACA G3TCCGGGCA TT3GG-3GAAC 



480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
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ggattctggt ggtgttccaa atcatoggct 

CGACAACGGC AGT GT CCTAC ATGTCGGTGA 

5 

agacaaaatg gttcgtggct t3gggacgca 
agggaattgg aagggaaatt gaagggaatg 
10 ccgacatgga gatgagtcct tggttccaat 
cgttgaaggt aaa-gaaccaa atcagagaaa 
cttaccg r jA tga g ggattt ggt gagt- g 2 a 

15 

AACT'GAAAT' 3 C ACCTTCACA TCTC G CAAG A 
GTGATGT' r :t to gttt cat g ■ 2 AAATT 3G 3 T 
20 TCCGGGTOGC TGTGAATGAG AAGAAGAAAA 
TCC'j^TTOT G3GG-YTCCAG CAAAATOGAG 

g gttcctfgv g or r ga< goat g r rrc atg att a 

25 

TGTCCCGACC C GCAGTOCTT CTGGAAAAAG 
TTATCAATAT CCCAGTGGAA rGGTTTTCCA 
30 TTGGTGACAT CCGACAGGGG ATCTTCTATG 
GT 3 3 GG AG* 2 A CATGATGGAT CAGCAC'GAGG 
T CO GAG G' GAT TGC GGTTG3G TGCTTCT3GG 

35 

TACAACTCAC GAATGCCTTC TAGAGTATGT 
TOGGCTPGAT CATCGTG3GT 3GAATCTGGC 
40 TGGTATTTCA GGTGTTTCGG AACATCAGTG 
AAGTCCOGOG GGTAGACTAT GAGGGGCTAA 
CGTTGOGCT3 CGrTGGCATG ACTGTCATCT 

45 

ATTGGAAATG GGOGG3GGTC ACAGTOGAAG 
GGATGTGGAA TGT< 3TATGTC TTTGCTCTGA 
50 ATGGAGAAGA CCAGTCCAAT GGAATGCAAC 
TGTTTGTTTC GGAACTTTAT CAAGAATTGT 
ACAACGCAGC ttctggtatt TGAGTCAAGA 

55 

GCAGTTGTCA GAGTCAGATT GATTGTAGTT 
TTTATOTOAA AATGTTAAAT ATAAGGAAAA 
60 AAAAAAAAAA AAAAAAAAAA AAAAAA 






TTCTGGTGGG AGGCTTGATT G~TOGAGGGC 3 00 

AATGTGT3GA TGGCCGTAAG A AC' CATCA GA 4 GO 

AT GATTGT 2 A C AA 1 GAT 1 G G G A GAG A' TT GAAG 4 H 0 

ACATG GTGTT TTCTGTTCAC ATTC GCCTGC 54 0 

TCATGCTGTT TATCCTGGAG CTGGAGATTG 60(1 

AT GGA G AA GT C'T G GATG GAG - 2 TTT GCCTGG h h 0 

CTGAAATGGC CGAPGAAAGA GTAGGA GGGA "'GO 

CTCCAGAGGA TGAGGGGG'GT TACTATGAAT '.'HO 

GT 3TGGC G GA TAAGTTTrAC GGTTTAAAGA R40 

TGAATGTG GG AYTTOGGGAG AT AAA 3 GAT A 9 1 'f ■ > 

OG TTC AC GAA G GTGTG G' FTT - G G'GA' rG AA- G A 0 t> 

r T G GT 3TGGT A TT< G GA G GAG G AT' G AC' GAT G~ 1 0 G i. > 

TCATCTTTGG CCTTG3GATT TCCAT GACCT H)H*., 

TGG3GTTTGA CTGGAG GTGG ATGGTGGTGT 114: ■ 

CG AT GCTTCT GTCCTTCTGG ATGATGTTCT 120'^ 

GGAA GC A r AT TGGAGGGTAT TO GAAG GAAG 12 mO 

TCTiGATATT TGACATGTGT -GAGAGAGG3G 13G" 

GGA'GTACAGA CATTOGAACA GAGGTG3CCA IjhO 

TGTGGCTCTA CTTCCTGTTT GTATGGTTCA 144" 

GGAAGCAGTC CAGCCTGCCA 'GCTATGAGGA 150'^ 

TTTTTAGGTT CAA'GTTCCTC ATGGTTATGA IS^O 

TCTTGATCGT TAGTCAGGTA AC GGAA 1 GOG C 162 0 

TGAA GAGTGG CTTTTT'CACA '3GGATCTAT3 lb" 

TGTTCTTGTA TGGACCATCC ■ CAT AAAAACT 174 0 

TCCCATGTAA ATCGAGGGAA GATTGTGCTT 1800 

TCAGGGGTTG GAAATATTCG TTCATCAATG 1860 

AGGCAACACA T3TTTATCAG CTTTGGATTT 192 0 

GTATACGCAC ACAAATACAC TCATTTAOGC 1980 

AAGCGTCAAC AATAAATATT GTTGAGTATA 2 04 0 

2066 
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5 (2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1705 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:

15 aattotgcak aga;a:ag2Tg tcgtctcgaa ggaactggtc tgctcacact TGCTGGCTTG 60 

Cf;CAT2ACX2A CTAG7TTTAT CTCCTGACTC ACGGTCtCAAA G2TG7ACTCT GCGAACGTTA 120 

AGTCcGTOCO iAA'^'GCTTGG A\T2CTACGG CCCCCACAGC 03GATCCCCT CAGCCTTCCA 180 

20 

o^tcctgw: tcccgyggac g:tc.aacaat ggoctccatg gtgotacagg taatgggcat 24 o 

CGCG2TOG2G GT--T:TG}3CT G2cto3co2t catoctgtgc TG2G2C;CT.;C CCATGTGGCG 3 00 

25 cgtgacggcc ttcatcgg2a g:aacattgt cacctogcag ac 2atctg2g agggcctatg 3 60 

GAT 3AACTG ' CT >GTG:AGA G2ACCG:vCCA CACCrCAGTGC AA GGTGTACG A:TCG:TGCT 420 

^xtactgccg ca2ga2ct3C aggoagcccg cgtcctcgtc at2atcag2a tcatcgtggc 4&o 

30 

tgct:tggg: gtgoigctst ccgtggtgog ggtcaagtgt accaactgcc t-ggaogatga 54 o 

A AGC 1 2C CAA- i G_C : AAG AC 2 A T 1 2A "PC C ; T GGC GG- G 7 GTGGT- 2 TTC C X 3 XT GG C ■ 7GG2CTTAT b C " 0 

35 GGTGATAGTG C02TGTCCT G^OTGCCOA CAACATCAT 2 CAAGACTTCT AC AAT 2CGCT 6f-0 

GGTGGCCTC2 G7-G2AGAAGC &7*GA3AT-3GG TG2:TCGCT2 TACGTCGG2T G2GC2 2CCTC 720 

CGGNCTG2TG CTCCTTGGCG GGG AOCTGCT TTG2TGCAAC TCTCCACC CC CCACAGACAA 7R0 

40 

GCCTTACTCC GCCAAGTATT CTGCTGCCCG CTCT02TGCT GCCAGCAACT ACGTGTAAGG 84 0 

TGCCACGGCT CCACTCTGTT CCT2TCTGCT TTGTT 2TTC 2 CTGG ACTG AG C" TC AC-2GCAG 900 

45 "V /2GA ' ' "• • V-A ;A V C-O:.; ;T . . 2A : 7 — CCA2T2G2TG OTGAC3AGIG GG:A~CCGGG2 970 

AGAGACT2AG 2C. AGGCAGGA AGGCASCA'SC CTTCAOCCI C TCTGGCCCAC 1 CGGACAACT 132 0 

TCCCAAGGCC GC7T2CTGCT AGCAAGAA2A GA2TCCACCC TCCTCT2GAT ATTGGGGAGG 1080 

50 

GACGGAAGTG ACAO 2GTGTG GTOGTCrOA7T 2GGGAGCT CTTCTG2TGG CCAGGATG3C 114 3 

ri , i;.r^r^ , ''.i c^^-n^o^t^ Tc-r^ r - r ^^ <AOC7TTGG2C ACTGTCCC'"YA 'ITT AC ATTTT 1200 
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CAGGTTT3GG CA3T3GTGC?G GGAG3G3GC3 AGAGAGG3G3 CTCAGGTTGC CC AG 7T CTGT 144 0 

GGC C ' PC A- GO A CTCTrTGCCT CAC2C3CTTC A3C 3CAGGG2 CCCTGGAGAC TGATCCCCT1" 1500 

5 TGAGTCCTCT GCCCCTTCCA AGG A' IT ACT AA TGA 3C 2T3G3 AGGGTGGCAG G3AG3AGGG3 IS 60 

ACAGCTTCAC CCTTG3AAGT CCT3G3GTTT TTCCTCTTCC TTCTTTGT'GG TTTCTGTTTT 162 0 

GTAATTTAAG AAG AG 3T ATT CAT-SACTGTA ATT ATT ATT A TTTTCTACAA TAAATGG3 A Z 16 SO 

10 

CTGTGCAGAG G RAAAAAAAA AAAAG 17^5 



(2) INFORMATION FOR SEQ ID NO: 107:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1167 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107:

25 

T3CA<3GAATT CG3CAGAG3T TTT< 2CG 2TAG ACTCTGG2AG TT 33TGAG2A TCAT33CAA2 6 0 

CSTTACAGCS AGAACCAAAG T3CCG3AOAT CCGTGAT3TA AGAA 3GATTG AG2GAATCG3 12 0 

30 T3CC<:ACTCC CACATCC3G3 GACTG 3G3CT G 3ACGATGC- Z TTG3AG3CTC G3CAG3CTTC 130 

G2AAi3GGYTG GT3G3TCAG3 T3G3GG2ACG GOGG3CGG2T GG2GTG3T3C TG3AGAT3AT 2 40 

CCG0GAAG3G AAGAITG:C3 GTCGGG2AGT C 7TTATTG7T GGrSAGSCGS G2A2G3G3AA ' ? , )0 

35 

GACGSCCATC G2CATG33CA TGGOGIAGGO CCTGGGCCCT '3A2A03CCAT T0A'2A3CCAT ;<60 

CG3C3G2AGT GAAATOTTCT CCCTG0AGAT GAG2AJV3ACC GAG3CG3TGA CG7AGGCCTT 42 0 

40 CCGG2GGTCC ATCGGCGTTC GCATCAAGGA G3AGACGGAG ATCATCGAAG GG3AG3T3GT 4 30 

GGAGATCCAG ATT3ATCGAC CAG2AACAG 3 GACGSGrTCC AAG3TGGGCA AACTGACCCT 54 0 

CAAGA02ACA GAG AT 33AGA CCATCT AC G A CCTGGGCA2C AAGATGATTC AKTCCCT3AC 600 

45 

2AAG3ACAAG GTCCAG3CCG >G3GACGT3AT CACCATCGAC AAGG03ACG3 GC AAG\TCT2 66 0 

CAAGCTGOGC CGCT3CTTCA CA0303CCCG CGAACTACGA 03CTATG33C TC 3CAGAC2A 720 

50 AGTTCGTG2A GTGCCCAGAT GGGGAGCTCC AGAAACG2AA G3AG3TGGT3 CACACCGTGT 7R0 

CCCT'3CA'2GA GATCGACGTC ATCAACTCTC GCACCCAGGG '3TTCCTGG2G CTCTTCTCAG 84 0 

GTGA2ACA3G G3AGATCAAG TCA(3AAGTCC GTGA3CAGAT CAATGOCAAG GT 0GCTGAGT o 00 

55 

GG2G2GAGGA G 3GG-\AGGCG GAGATCATCC CTGGAGTGCT GTTCATCGAC GAG 3TCCACA 960 

TGCTOGACAT CGAGAGOTTC TCCTTCCTCA ACCG3GCCCT GGAGAGTGAC ATGGCGCCTG 1020 

60 TCCAGCAi^GT CTAT 0GGG AT GCCGTGAGGG CTCTGGTAGC TG3TGCCCCG GATTOSCGT 3 108 0 
ATGCCACGGT TGGTGGCCTC GTGCCGAATT CCTGCACCCC GC^GGGATCCA CTAGTT-3TAG 114 0 

AGCGGCCG3C A2CGCGGTGG ANCTC2N 1167 



(2) INFORMATION FOR SEQ ID NO: 108:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1907 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:

GQ2A2AG-GG AATCATCC7G TGATGTGTGT OCTG2C7TTG TGAGTGTGTG f GAGTGGTGGT 60 

OA3GTGTOAG GTACAGTGTG TTTGATCGTG GTGGGT'I-GAG GGGAACCCTT GTTCAGAGCT 120 

GTGArTGCGG C^3CACTCAG A<2A^GCTOCO CTTG3CTGCT CGTAG2G2CG G3CCTTCTCT 130 

25 CCTC3TCAT3 A/IO'CAOAG""A GCCAGT 3TC :: GGGAGGCAGA AGGTA 2CGG3 GC AGCTACTG 240 

gaggactctg cgggcctg2g tg3gctg2cc cctccggcgt g2Ggccctgt toi;tgctgtc 3 3C 

CATCTATTTC TACTACTGCC TCCCAAATGC GGTCGG2CCG C2CTT2ACTT OG A TGCTTGG 360 

cctcctogg:: gtctc3CAG3 cactgaacat cctgctggsc ctcaa-3G3cc tgggcccagi' 4 20 



35 AT ATT A 1 2 ATG 2*GATAT GT2G G3CTGATCCT GCCAGAGGTG CAGGOCCG3A TTC 3AACTTA 540 

1 2 AATC AGO A T TA> AAAOAA 2C TGGTACG3G3 TGCAGT<3AG3 OA 3CGGCT3T ATATTCTCCT 60 0 

■"CCATTOjAC TGTGG3GT>2 CTGATAACCT GAGTAT-3GCT GA3CCCAACA TTCGCTTCCT 660 

' 3G AT AAAG T G GG'2CAG2AGA CCOGTGACC 3 TGCTGGCATC AA '3< 2A TCGGG TTT AG A GC AA 720 

GAGGATCTAT GA 3CTTC7 >G AGAACGGGCA GCGGGC 3G"JC AGCTCTGTCO TGGAGTAC 3C 7R0 

45 rACCCCCTTG GA3AG ITC' JT 1 TG7 2ATGTC ACAATA 2AGT OAAO0 TG3CC IT A GC f 3GG3A £40 

^GATAGG7TT GA3CA3GC 2 A AACTCTTCTG CCG2ACACTT GiAG-3ACATGG 7G2XGA 3AT GG v'2 0 

CCCTGAGTCT CAGAACAACT GCCG3CTCAT TOCOTA 2GAG GAA\ 2 G TGG AG ATG AG AGO AG 960 

CTTCTC3CTO TCCCAGGAGG TTOTCSGGCA C3T3COG2AG GAG3AAAAG 3 AAG AGS TT AC 102 3 

■;v r;Y -v;' TTGAAGACCT CAG7G7T07C GAGTACCTCC ACGATGTCGG AAGAGCCTGA 1080 
GGACTT'3ACA TCTTAAGAT5 CGTCTTGT3C CCTTGGGCCA GTCATTTGCC CTCTCTGAOO: 13: 0 

CTOG3TGTCT TCAACCTGTG AAA'PJGGATC ATAATOACTO CCTTACCTCC 0TCACO3TTG lJc.0 

5 TTGTOAGGAC TGAGTOTGTG GAAGTTTTTC ATAAACTTTG GATGCTAGTG TAOTTA'3G3G 144 0 

gtgtg'jgags tgtotttoat g3go3cttoc agaccga3tc cccaccctto t3G3cttcct i3o.i 

TT'^j -G'3G3G ACGGCGAACT CTCTCAATG3 TATOAAGAQ3 GTOCTTCGCO CTGTG3CTGG 1S^0 

tg 3tcatgtt cgattattgg ^jAorcccAG cagaagaytg gaga3gag3a g3ag3o?gag ic:-i) 

tttogg3tat tgaatccccg 03ctccoacc ctg3ag3atc aa3gttg3ta tg3agtctgo 16r-0 

15 tgg;*3G3caa otottgggta atgatgagta tct3tagi;at tctg3gacca '::ttc:ttccc 1740 

to3c :cstta agoota-ggto tgtatcgsca gggggagggg actagagtag toogtctoac lsco 

T , p>::=>TrrTc cttatactcc accccttt3t caacggtcct r rTTTTAAAG3 agatotcaga iRt-.o 

TTAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAG3G GGGCGGC lS'OV 



(2) INFORMATION FOR SEQ ID NO: 109:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 611 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109:

ATGAATTAAC CrCOAA03 IT-IT NAAiAG3GAC TCACTATG3G GGAAAGNTG3 GTAAOGCCTG m) 

CAG3TACCGT TCC'3GAATT0 CCG3GTCGAC CCACGCGTCC GATGG3G3TT TAGTAAA.TCA 13 0 

40 GGCTTGCAG3 CTCAAAG3T 3 CAATCTGCCC AGTCTCAG3T ACTGAGACTT TGTG3GCCTC 180 

AGACACCAGG AAGAAAGTTG GGATACAGTC A'TTTGAGTTA AAAA3G3AAT GACC 3CTCAG 24'") 

AAAC G CGCAT TA3CAGTGTT ACTCTTGGAA GTG3 CTTT A 3 TTTTAACG3T CTCTGTTCTG 300 

AAAAAGAG3T GTTTGGTTAO GTGTGA3CCA AGATCACGTT TTGTTAG3TG TGATTTACCT 3 00 

TTGTCCGTTT AAAAGACTT 2 AG3GAGCCAT TCTGTATACA AG3TGTG3TO TTT CG AAT 1 3T 41:0 

50 AGAA 3GGGTT ATGGAAAAG3 GT3CGATCCT TT'GCTGTAAA CTGGAGAGAC CAGTCCCAAA 4 80 

CAGA3GG3AA TTTTAAGCCG TTGTCATCA 3 C C AATT3G AT GTTTTTGGTT AT AGO AAA TT 540 

CCTG3AAAAT AAATAAATAA ATATTTGCAA AACTAAAAAA AAAAAAAAAA AAAAAAAAAA 600 

GGGGGGNCCN C 611 



(2) INFORMATION FOR SEQ ID NO: 110:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2632 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:

TCCCAGCTCT CAGGACAAGG GCCCT<3GGCG AT CTTTT AAA AAAGCCGATT GGGTGTCTTT 6 0 

CTAAAANTAC AACCAGTACT TCATCOTOAA GTTTCTGGGA AG03AGTCCC < ".'TCCAG ATTC 120 

15 TCATGGAGTG ACAAATCTTG ACTCTT03TC CTC^AATTTT TCAGGCCCAA ACTAGCGTTT 130 

CTACAAT3AT TTATTTGCCA AATTTGTCTT GATTATGGGT GOTTGATGAG GAACGTGCTT 240 

TTGTTAG3AA CCGAAACTGG GCGGCGGTGA 'lAGGCGTGTAC GCAATGAGTC ■ 7 GG AAG AGGG 300 

TGAAATC-C TT TCGGTAGGCA CTCCACG3CT GTOAAGATGG CG^IGGCTGC GTGGGTTCAG 3 60 

GTGTTGCCTG TCATTCTTCT GCTTCTGGGA CCTCACCCGT 'GAG 2ACTGTC GTTTTTCAGT 42 0 

25 GCGGGACCGG CAACCGTACC TGCTGCCGAC CGGTCCAAAT ■3GCACATTCC GATACCGTCG 4 80 

GGGAAAAATT ATTTTAGTTT TGGAAAGATC CTCTTCAGAA ATA CC ACT AT CTTC 2TGAAG 540 

TTTGATGGAG AACCTTGTGA CCTGTCTTTG AATATAACCT GGTATCTGAA AAGCGCTGAT 60 0 

TGTTACAATG AAATCTATAA CTT<GAAGGCA GAAGAAGTAG AGTTGTATTT 33 AAAAACTT 660 

AAGGAAAAAA GACO'TTGTC 'TGGGAAATAT CAAACATCAJT CAAAATTG rT T CCAGAACTGO 72 0 

35 AGTGAACTCT TTAAAACACA GACCTTTTCT < .X3 AGATTTT A TGCATCGACT GCCTCTTTTA 7 80 

GGAGAAAAAC AGGA3GCTAA ( >GAG AATGG A AC AAAC CTT A CCTTTATTGG AGACAAAACC P-4 ) 

GCAATGCATG AACCATTGOA AACTTGGCAA GAT3CACCAT ACATTTTTAT TGTACATATT 90 0 

1 3GC ATTTC AT CCTCAAAGGA ATCATCAAAA GAAAATTCAC TGAGTAATCT TTTTACCATG S»6 0 

ACTGTTGAAC TGAAGGGTCC CTATCAATAC CTOACACTTG AAGACTATCC '■ CTTGATG ATT 102 0 

45 TTTTTCATC-G TGATGTGTAT TGTATATCTC 'TGA; rrOGGTG TTCTGTGGCT GGCATGGTCT 10 80 

GCCTOCTACT CX3AGAGAT IT OCTGAGAATT GAG 1TTTGGA TTGGTGCTJT CATCTTCCTG IMC 

Q3AATGOTTG AGAAAGCTGT CTTCTATGCG GAATTTCAGA ATATCCGATA CAAAGGARAA 1200 

TCTGTCCAGG GTGCTTTGAT 7 CTTGC AG AR CTGCTTTCAG CAGTGAAACG CTCACTGGCT 126 0 

CGAA* 2C ' "*TOG TCATCAT AG T O AGTCT' GOG A ' V AT COG AT C G T C A A GC C ACG G G TOG ACT C A 13 2 0 
TAAAACTTCG GAGGAACATT GTAAAACTCT CTTTGTATCG GCATTTCACC A\CACGCTTA 1560 

TTTTGGCAGT GGCAGCATCC ATTGTGTTTA TCATCTGGAG AAC CATGAAG TTC AG AAT A< 3 162 0 

5 TGACATGTCA GTCGGACTGG CGGGAGCTGT G3GTAGACGA TGCCATCTGG CGCTTGCTGT 168 0 

TCTCCATGAT CCTCTTTGTC ATCAT' 3GTTC TCTGGCGA:C ATCTViCAAAC AACCAGAGGT 174 1 ■ 

TTGGCTTTTC ACCATTGTCT G AG3AAGA 1 3G AGGAG3ATGA ACAAAAGGAG CCTATGCTGA 1900 

AAGAAAGCTT TGAA03AATG AAAATGAGAA GTACCAAAGA AGAACCCAAT GGAAATAGTA 1860 

AAGTTAACAA AGCACAG3AA GAT" 3ATTT 3 A AGTGGGTAGA AG AG AATGTT CCTTCTTCTG 192 c 

15 tgagagatgt A'X'a :ttgca gcccttct^g att:agatga GGAACGAATG ATCACACACT I960 

TTGAAAGGTC CAAAATGGAG TAAGGAATGG GAAGATTTGC AGTTAAAGAT G3CTACCATC 204? ) 

AGGGAAGAGA T'3AG2ATCTG TGTCAGTCTT CTGTACGO:T CCAT03GATT AAAGG AAGCA 2100 

ATGACATCCT GATGTGTTCG TTGATCTTTG G 3C ATTGG AG TTGGCGAGAG GTGTCAGAAC 2160 

AAAGAGAA3A TCTTACT3A\ AV3AAGTTCA T AAG ATGAO A AAAAT 3TACG AG2TTCTTAT 2 22 0 

25 TTACAACACT G2TG2CCCCT TTC 2TCCCAG ACTCTGACAT GGAT3TTCAT G2AACTTAAG 22 80 

TGTGTTGTTC CTGAACTTTC TGT AATGTTT CATTTTTTAA ATCT3ACAAA CT AAAAAG TT 2 34n 

TAA2GTCTTC TAAAAGATTG TCATCAACAC CATAATATGT AATCTCCAGG AGCAACTGCC 2 400 

TGTAATTTTr ATTTATTTAG GGAGTTACAT AGGTG ATGGG GGAAATTGTT AACTACCTTT 2460 

CATTTTCCTG G3AAGTCAAG GTT AC ATCTT GCAGAGGTTG TTTTG AGAAA AAAGGGCCCT 2 520 

35 TC TG AGTTAA GGAGCCATAG TTCTATCAAT GATCAAAAGA AAAAAAAAAA AACTC 3ATCG 2 58 0 

GCACGAGGGG GGGCCCGGTA C CC AATTCGC CCTATGGGAN TCGAATGAGA CC 2 63 2 

(2) INFORMATION FOR SEQ ID NO: 111:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2249 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:

G AATTCGGC A CGAGCTCACC GTGCTGCGTG AC ACAAGGC C AGCCTGCGCC TACGAGCCCA 60 

TGGACTTTKT RATGGCCCTC ATCTACGACA TGGTACTGSW TGTGGTCACC CTG3GGCTGG 12 0 

CCCTCTTCAC TCTGTGCGGC AAGTTCAAGA GGTGGAAGCT GAACGGGGCC TTCCTCCTCA 130 

TCACAGCCTT CCTCTCTGTG CTCATCTGGG TGGCCTGGAT GACCATGTAC CTCTTCGGCA 240 

60 ATGTCAAGCT GCAGCAGGGG GATGCCTGGA ACGACCCCAC CTTGGCCATC ACGCTGGCGG 3 00 
CCAGCGCTGG GTCTTCGTCA TCTTCCACGC CATCCCTGAG ATCCACTGCA CCCTTCTGCC 36C 

AGCCCTGCAG GAGAAZAOZC CCAACTACTT CGACACGTCG CAOZCCAGGA TGCGGGAGAC 420 

GOZCTTCGAG GAGGACGTGC AZCTOZOZCG GZOZTATATG "3A< 3 AA ! Z AAGG CCTTCTCCAT 4 30 

GZATGAAZAC AATGCAtZCTC TCOZAACAGC AZGATTTCCC AACZGCAGCT TZGZA^AAAG 5 40 

10 ACCCAGTZGC AGCTTZGGGA AAAGAGGGAG CGCTCCGTTT AGAAGCAACG T3TATCAGCC 600 

AAC TGAGAT Z GCCGTCGTZC T2AACGGTQZ GACZATCCCA ACTZCTCZZO CAAGTCAZAC 6 60 

AZGAAGAMAC CTTTGGTGAA AGAZTTTAA3 TTCCAZAGAA TCAGAATTTC TZTTACCGAT 72 0 

ITGCCTCCCT GGCTGTGTCT TTCTTGAGZG AG AAATCGGT AACAGTTZCC GAACCAGZCZ 7 30 

G'ZCTCACAGG C AGG AAATTT GZAAATCCTA GCCAAGGGGA TTTCGTGTAA ATGT* Z AAG A Z 84 0 

20 TGAGGAACTG AAAA'GCTAAC ACCGACTOZC CGCCCCTCCC CTGCCACIACA CACAGACACG 900 

T AATAC C AG A OZAACCTCAA TCCCCGCAAA CTAAAGCAAA GCTAATTGCA AATAGTATTA 960 

GGCTCACT'ZG AAAATGTGZC TZGGAAGACT GTTTCATCCT CTGGGGC;TAG AACAGAACCA 1020 

AATT'ZACAGC TZGT3GGCCA GACTGGTGTT GGTTGGAGGT COZGGGCTCC CAZTZTTATC 1080 

ACCTCTCCCC AGCAAGTG'ZT GGACCCCAGG TA3CCTCTTG GAGATGAOZG TTZCGTTGAG 1140 

30 GACAAATGGG GACTTTGCGA CCGGCTTTOZ CTZGTGGTTT GCACATTTCA GOGGZGTCAG 1200 

GAGAGTTAAG GAGGTTGT : ZG GTZGGATTC Z AAZGTGAGGC CCAACTGAAT ZGTZGZGTGA 126) 

GCTTTATAGC CAZTAGAGGT GGAGGGACCC 1XZGCATGTZC CAAAGAAGAG GCZCTCTGGC 1320 

TGATGAAGTG ACCATCACAT TTZGAAAGTG AT C AAC CAC T GTTCCTTCTA TSGGGCTCTT 1380 

GCTCTAGTGT CTATGGTGAG AACACAGGCC CC3CCCCTTC CCTTGTAGA Z GGATAGAAAT 144 0 

40 ATTCTGGCTT GGGGCAGCAG TC ZCTTCTTC CCTTGATCAT CTOZCCCTCT TZCTAZACTT 15 CO 

ACQZGTGTAT CTZCAAATZC TCTCCCAATT TTATTCCCTT ATTCATTTCA AGAGZTCCAA 15 CO 

TGGGGTTrC C AGCTG AA AT J S C ' ZCT ZCGGG A < 'X ^Z AG ZTTGG AAGOZ A 1 ZGC A G Z A Z ZGC AGG 16 2) 

TTTTCC'ZCGA TGATGTCACC TAZCAGGZCT TCAGGGZTTC CC ACT AGG AT GZAZAGATGA 1680 

CCTCTOZCTG G CTCAC AAGC AGTG ACACC T CGGGT ZCTTT CCGTTGCTAT GZTGAAAATT 17 4 0 

50 CCTZGAT2GA ATG GATC AC A TGAZGGTTTC T r 2GTT ZCTTT TGGAGGGTZT G ZG-GG AT ATT 1800 

ZTGTTTTGGT TTTTCTGCAG GTTCCATGAA AACAGZCCTT TTZCAAZCCC ATTGTTTCTG 186 0 
AAGAAAAA-GG AAAAA. GGGGA GTGTGGGG-GG- 2-G G GGGGGAG GACTGACCGC TTCATAAGCC 2160 
AGTACGTCTG A GGGGA, GG AT GGGG G AAG2 AG. AGCTTTTGAT ATTTG TC AAA AAAAAAAAAA 2220 
AAAAANCCCS C-GGGG- GGGG G CGG^GGGGG 2 24 9 



(2) INFORMATION FOR SEQ ID NO: 112:

(A) LENGTH: 2198 base pairs
(B) TYPE: nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:



GAT ACT AT AA GG' 2 A_A' 2 TGAC GCA2GGGGGC C-2GGTTAGAC TAGTG2ATCG 2GG2T-GCAGG 60 

GANCOGAAAA TGATGAAA.GT '2 AG 2GTGAAG ACCCCGAAGA AAAGGAG3AA TTCG2CGTG2 180 

CCGAGAATAG GTG G GTCGAG 2AGGGTAAGG AAjGAAATGT C TAAA 2 2TTTT AA\ATCA2ATA 240 

CTGACGAAGT TGTGGTGA T A GTGG- 2TG2LAA AA.ATTTTGAA AGA TC AAG AT A2 2TTGAGTC 300 

A -GG ATGGAA T GGA.G-GATGGA GTG ACTGGGC AG CTTG T* 2 AT TAAAA 2AGAA AAIAGGGCTC 360 

A. 33ATCA.GGG A GGGG AGGAA. ACAAATACAG G2GGAAG2AA TGTTA 2TACA T2AT2AACT2 42 0 

CTAATAGTAA GTCTAGAGGT G-GGG GTGGT A CTAGCAAC CC TTTT G 2TTTA G3TG32CTTG 480 

G-3GGA.CTGGG AG2GGTGAGT AGCGGGGGTT TG-AATAGTAC CAACTTCTCT '2LA^ 2TA'2AGA 54 0 

GTCAGATGCA GG' GAG2AACGT GTG7CTAACC GTGAAATGAT GGTCCAGAT2 AT G G AAAAWC 600 

CCYTTGTTCA G_A2<2_ATOGGC . "GAAATC GT -GAG CTG ATGN AGA2AGTTAA TTATGG2CAA 660 

TCCAGAAATG CAGG2AG TGGA TACAGAGAAA TCCCAGAAAT TAGTCATATG TTGAATAATC 720 

CAGATATAAT GAGACAAACG TTC-GAAGTTG C 2 GAG 2 AATC CAG2AATGAT G2AG2AGATG 780 

ATGAGGAACG AG 2ACCGAGG TTGGA GGAA.C CTAGAAAGCA TCCGAGGG2G ATATAATGCT 840 

TTAA2GCGCA TGTACACAGA T.ATG 'GAGGAA CGAATG2TGA GTG2TGGA2A AGAG2AGTTT 900 

GGTGGT AAT 1 2 CATTTGCGTC CTTGGTGA-GG AATAC ATC CT CTG2TGAAG3 TAGTCAACCT 960 

TCGCGTACAG AAAATAGAGA TCCAGTACCC AATCCATGGG CTCCACAGAC TTCCCAGAGT 102 0 

TCATCAGCTT CCAGGGGCAC TGGCAGCA'GG GTGGGTOGCA CTACTGGTAG TACTGCCAGT 1080 

GGCACTTCTG GGGAJGAGGAC TAGGGCGCGA AATTTG2TGC CTG2AGTAGG AG2TAGTATG 1140 

TTCAACACAC CAGGAATGCA GAJGCTTGTTG CAACAAATAA CTGAAAACCC ACAAGTTATG 1200 
CAAAACATGT TGTCTOCCCC CTACATGAGA AGCATGATGC AGTCACTAAG GC AGAATC CT 1260 

GACCTTGCTG C AC AG AT 3 AT i3CTGAATAAT CCCCTATTTG CTGGAAATCC TCAGCTTCAA 132 0 

5 G AAC AAATG A GACAACA3CT CCCAACTTTC CTCCAACAAA T3CAGAATCC TGATACACTA 13 P0 

TCAGCAATGT CAAACCCTAG A3CAATGCAG <3CCTTGTTAC AGATTCAGCA GGGTTTACAG 1440 

A2ATTAGCAA CGGAAQ-v* T 33GCCTCATC CCAGGGTTTA CTCCT33CTT GGGGGCATTA 1500 

G3AA3CACTG GA3GCTCTTC GGGAACTAAT G3ATCTAACG COACACCTAG TGAAAACACA 1560 

AGTCCC AC AG CAGGAA2CAC T jAACCTSG A CATCAGCAGT TTATTCAGCA GATGC TGCAG 1620 

15 GCTCTTGCTG GAGTAAATGC TCAGCTACAG AATCCAGAAG TCAGATTTCA GCAACAACTG 16 80 

GAAGAACTCA GTGCAATOG3 ATTTTTGAAC CGTGAAGCAA ACTTGCAAGC TCTAATAGCA 17 4 0 

*rj»iG*nr,Tn aTATrAATV AGCTATTGAA AGGTTACTGG G2TCCCAGCC ATCATAGCAG 1800 

CATTTCTGTA TC TK G A AAAA ATGTAATTTA 'IT TTIGATM C3GCTCTTAA ACTTTAAAA T I860 

ACCTGCTTTA TTTCATTTT3 ACTCTT3GAA TTCTGTGCTG IT AT A AAC AA ACCCAATAT3 192 0 

25 ATGCATTTTA AGGTGGAGTA CAGTAAGATG TGTGGGTTTT TCTGTATTTT TC TTTTCTGG 1930 

AACAGTGGGA Al'TAAGGCTA I'TGCATGCAT CACTTCTGCA 7TTATTGTAA TTTTTTAAAA 204 0 

ACATCACCTT TTATAGTTOG GTGACCAGAT TTTGTCCTGC ATCTGTC C AG TTTATTTGCT 2100 

TTTTAAACAT TAGCCTATGG TAG T AATTT A TGTAGAATAA AAGCATTAAA AAGAAGCAAA 2160 

AAAAAAAAAA AAAAATTCCT GCGCCCGCGA ATTCTTCT 219S 

(2) INFORMATION FOR SEQ ID NO: 113:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1043 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:

CTGAAGT3TA TGT33T3A3G AAGAAGAGGC TCCTACTGTA GACAG2CTT3 TT C TAG AG AT 6C 
50 CCTCCCAGAA APCTCT303C CA3GTG3AAC CCAGGGTCAG AGAG3GATGG -3AGAGAG3TT 12 C 

TAATTTTCCA TGATAAATAA AAATCTATAA AATAATAAAC AAGA 3 AAAAG AG ATTG 3 AAA 13C 
rgactt< sgat GGGTTTGAGG GTTACTCCCT GAGTGACTGG CTGTGCCTGG CTTTTGTGGA 480

aagcaagttc AACATATCAA AGATWAATG A AAAT3CAGAT G3AAGCTTTG ACTATGGSCT 540

cttccagatc AACAGCCACT ACTGGTGCAA CRATTATAAG AGTTACTCGG AAAACCTTTG 600

c2acgtagac TGTCAAGATC T GC T 3 AATCC CAACCTTCTT GCAGGCATOC ACTGCGCAAA 660

aaggattgtg TCCGGAGCAC GG3G3ATGAA CAACTG03TT AGAATGGAAG KTTGC ACTGT 720

tca<3gco3gc CACTCTTCTA CTGGCT'3ACA '3GATGCCGCC TGAGATKAAA CARGGTGCGG 780

gt02accgtg GARTCATTCC AAGACTCCTG TCCTCACTCA R3GATTCTTO ATTTCTTC TT 840

CCTACTG^CT CCACTTCATG TTATTTTCTT CCCTTCCCAT TTACAACTAA AACTGACCAG 900

AGCCCCA3GA ATAAATGGTT TTCTTG3CTT CCTCCTTA3T CCCATCTQ l-\ CCCAGTCC 3C 960

T3GTTCCTGT CTGTTATTTG TAAACT-3A3G ACCACAAIAA AGAAATCTTT ATATTTATCG 1020

AAAAAAAAAA AAAAAAAACT CGA 1043



(2) INFORMATION FOR SEQ ID NO: 114:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 703 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:

GAATTO 3GC A CGAGTGCGCG GGCACCAC3G 03GTTTTTCG ACGCTGG03G T3GACGCAGG 60

CAGCAT3GAC CACGGTTGCT GGGCGGAT<3G GGAGCGTCTA TGGTC AGTT 3 C2TTAGAAGT 120

GGTGAGATGG GAAGCTGCAG TTGG AAGAC C CTGGAGGATG CCT<3ACAA3G O 3 ATGTCTGA 180

C ACATG ATTG GAGCTCTTTT TGAAATGTTT CTTGCCCTTC CTGGAGCA3A GG AGC CATTA 240

TTTATGCAGG TACATCGAAG TCTTTTGACC T02ATACAGT GATTATGCTT GTCATC'GCTG 300

GTGGTATCCT '3GCGGCCTTG CTCCT3CTGA TAGTTGTCGT GCTCTGTCTT TACTTCAAAA 360

TACACAAC3C 1 3CT AAAAGCT GCAAAGGAAC CTGAAGCTGT '3GCTGTAAAA AATCACAACC 420

C AG ACAAG3T GTGGTGGGCC AAGAACAGCC A3GCCAAAAC CATTGCCA23 GAGTCTTGTC 480

CTGCCCTGCA GTGCTGTGAA GGATATAGAA TGTGT 1 3CCAG TTTTGATTGC CTGCCACCTT 540

GCTGTT3CGA CATAAATGAG GGC CTCTG AG TTAGGAAAGG TGG3CACAAA AATCTTCATG 600

AGCAATACTT CTTAGTAGAT TGTTTTGTTA TTCAAATCAA GTTCTAGT3T TTTTATGTGA 660

GATTATATAA TTTACAGTGT TGTTTTATAT ACTTTTGAAT AAA 703



WO 98/54963 PCT/US98/11422 



(2) INFORMATION FOR SEQ ID NO: 115:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3684 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115:

GGCAGAGGGG <3CATGAGCAG GAGGAGGATT ACCGCTACGA GGTGCTCA3G GCCGAGCAGA -50 

TTCTACAACA CATGGTGGNA ATGTATCCG3 GAGGT C AACO AGGTCATC 2A GAATC CAG ZA 120 

ACTATC AC AA GAATACTCCT TAG3CACTTC AATTGGGATA AAG AG AAGCT AATGGAAAGG 130 

TACTTTGATG G AAACCTG3 A GAA3CT'2TTT GCTGAGTGTC ATGTAATTAA TC C AAGTAAA 240 

AAGTCTCGAA CACGCCAGAT G AAT AC AA GG TCATCAGCAC AQ3ATATG3C TTGTCAGATC 300 

TGCTACTTGA ACTACCCTAA CTCGTATTTC ACTGGCCTTG AATGTGGACA TAAGTTTTGT 360 

AT3CAGT3CT GGAGTGAATA TTTAACTACC AAAATAATG3 AAGAAGGCAT GQ3TCAGACT 420 

ATTTCGTGTC CTG3TCATG3 TTGTGATATC TTAGTGGAT3 AC AACACA- 3T TATGCG2CT3 480 

30 AT2ACAGATT CAA.AAGTT.aLA ATTAAAGTAT CAG3ATTT>V\ T AACAAAT AG ■3TTT3TAGAG 54 0 

TG2AATCGAC TGTT A AAGTG GT3TC 2TGCC CCA3ATTG3C ACCATGTTGT TAAAGTCCAA 600 

TATCCTGAT3 C TAAAC C TG T TC3CT3CAAA TGTGGGCGCC AATTTTGC T F TAACTGTG3A 660 

GAAAATTG3C ATGATCCTGT T AAAT 3TAAG TGGTTAAAGA AATGGATTA^ AAAGTGTGAT 72 0 

GATGACAGTG AAACCTCCAA TT3GATT3CA GCCAACACAA AGGAATGTCC -CAAATO2CAT 780 

GTCACAATTG AGAAGG AT' 3G TGGTTGTAAT CACATGGTCT GTCGTAACCA GAATTGTAAA 84 0 

■GCAGAGTTTT GCTG3GTGTG TCTTG3CCCA TG3<3AACCA ; AT'GGATCT' 3C -3TGGTACAA3 900 

TGTAACCG2T ATAATGAG3A TGATG2AAA3 GCAGCAAGAG ATGCACAG3A 3CGATCTAG3 960 

G2AGCCCTGC AGAG3TACCT G1TCTACTGT AATCGCTATA TGAACCACAT '3CAGAGCCTG 1020 

CGCTTTGAG2 ACAAACTATA T3CTCAGGTG AAACAGAAAA TGGAGGAGAT <3CAG 2AGCAC 108 0 

AACATGTCCT GGATTGAG3T G2AGTTCCTG AAGAAGGCAG TTGATGTCCT CTGCCAGTGT 114 0 

CGTGCCACAC T CATGT AC AC TTATGTCTTC GCTTTCTACC TCAAAAAGAA TAACCAGTCC 1200 



AAAGATCTGT GTACAT TGAG?GAv3TGA GAA'i v - -C .VT GCATAAAATG AACTCTGAAA 1440

ACTTTACCAT CTAGAGTGCT CATGCAATTA AAACAAAACA AACACAAACA AGGAGGCACT 1500

AAGCCTATTC TGACACCACT GGTCTGTAGT ACCAGAATTG TTTTGTTAAT GGAAAGTTTA 1560

AGTAAATT AT ATTGTAATAA AAAGGTAGAT AAAC C ATTGT ACAACAGTAT TCTAOGCCGC 1620

CAAC AAAAG T G TG AC A- 3 AC A CACTAAAAGC CCTCCAACTT TAACTTGTAA CGTAGCTTCA 1680

TTCTCAAA 3C TGACTC 2TTT TTTTTCTTTT TCCTTTTCCT GAGTGTAGTA 1 2AGTT AAAAT 1740

TTCAAACAGC TCCTTGACAC TGCTTTTCAT GTTCAAAC C A GCCATTTTGT TGTACTTTGG 1800

TAAAGGACCT CTTCCCCTTC CTCPOTACA 'CATACAGATA CACCCACACA < CAG ACTG ACT 1860

CTCTTTCTCT CATACCCCAA GGTCATGAGT ■3AATGATGCT TAGTTCCTTG TAAAGAAAAT 1920

CTTCJGGATSG GGAAAGG-3GT AGGCAGCAAG AGGATTCAAC AAACGAAAAA CATAAAAACT 1980

TTGTATATGA CTTTTAAAAC AAGAGGACAA CACAGTATTT TTCAAAATTG TATATAGCGC 2040

ATAT3CAT3G ACAAAGO\AG CGTGGCACGT GTTTGCATAA TCTTTAATTA CAAAAAAATA 2100

TTTATTCTTT AAAAATCTTC AAGATTATGT CTATTTGCTG TGCATTTTCT TTCAGTTTGC 2160

TTATCTTTCC CGGGTTGGGG TTGGGATAAA GGTGTGTCGG TTTAGCACCT CTGGAAGACC 2220

tatctagagc tctttcactt tcctgaggtt ATTTTGCCCY TTCTGGTGTT GGTATGTCTG 2280

ttgccggcca tgggctncay gccttgaatt CCTGCTCTTG ATC AGGGACA AGGGAGGTCA 2340

AGCTCTGArT aatgccatga cctgattaag GGGTAsCAGC A GGGAGTTTTG TTGCTACAGC 2400

tcatgaatta acctgtccca ACCTAATCCC CCTCCATGG-C ATCATGCCTC TACCCAAGCC 2460

TTTGTGTGCC CATGTTATGC ACACAGCTGT AGGCATTCTT AAGTCCCCT3 TCGCATCCAG 2520

TGGAAGCATT TTAAAATTTC TTTTACTTTT TGGTTTTCCC TT AATTGCT 3 CTTTTCAGAT 2580

TTTAGTTATG GCTCGTCTGC TCACCCCTTC TCT AC ATT AG GGTGTCAAAG AGAATGTTTT 2640

GCTTTAAATA TAAATAGCCA TTCATTTAGT CTCAGATTGT GAATTTAAAA TGGTGGATAC 2700

CGAAATTGCT TGTGTGTGTT GCTGTGGGTT TGGTTTGAAG GCAAACACCC CTAGAACATG 2760

ATATTCCCAT CTAGTGCATT TAAATAGAAA TCACTGAGTT TGCTGCTTTT TTATTGTCAG 2820

CAGATAGGAG AATTAATAAT GCATTTTAGC TGTGATGTCC ATTTTTATGA AATTCC TACT 2880

AAGAGCTATG TTAAAAGTAA AGGATGGTGG TGGTTGTATT AACTATATAC CTGTTTAGGC 2940

CATTCTGGCT GTGGTATTTT TCAATAGGTC AGCATCTGTA AATCTGTCAG TTTTATACAG 3000

GAGTGCAGAG TGAACTAGGC AACTAGATTA AG AGGTCT AA ATATGAAATA CCAGTTGAGG 3060

CTGAGGACCT CTTCGTCTTC CTTTAAATGT CTTTTGCCTA GGGAGTGTTT AC C ATTTGTG 3120

AGGCAGCTTT GTCTGCTCTT ACACTGTACA TCCTATTACT CCATTGGGAA GTAGGTTCAC 3180

TTTCCTCTGG CCTTTTGCCT AAGTTAGGCT TTGCTGAATC AACCCTACTT TTCCTTTTAG 3240

AAAAGGTTGT TACA3GA3AT TTACT33CAA CT3TTCTTTT CCCATCAAAA ATCAGT3AAT 3300

GTTT3CTGAG TATAAAT3CT OrTTCCTTAA ACCACTTGTC GCTTTAGGAT CAACTTTA3C 3360

TGTACCTTTT CTCCTTT3CT CCCTTOOCAC CTCA3GTGCA AATCTGAACT CAGTGTCTGC 3420

TTCTTCCATT TTCTCGTOTC TCTCCCCTCT TCCCCCATTA TCCATATGAC ATTATTTTAC 3480

TTCAAATGAC AGCATCAATC TTAAAAAGAT ATACATTAAA ACTAAGGAGT TTTTTTAAAG 3540



AAAGCCTGAA TAAGTTCCTT TCC2TGGTAA CTTTGAAAAG CAGTCAGAGT TGC TAT AT AG 3600

ATATATGTCX3 CTCCTTTAAA AT03TTTGTG T ATGTG TGGT GTTTAAAAAA AAAAAAAAAA 3660

TTCGGGGG<JG GGCCCG3TNC CCAT 3684



(2) INFORMATION FOR SEQ ID NO: 116:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1965 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:

AAGAAAGGCT ATTAAAATTC TAG AT C AC A T ATGG AC CCGG GAAGGTTTTT NACCCTCTCT 60 

TAGTGACATC GA0TCT2CCA CTAGACAAAA TAGGTGGAAA AATCTCTCGA OGGCTCACAT 12 0 

TGTTTTGTC A TCTTCA33AA AAACACCACC AGG CCAT AC C ACAGCCTGCC CAGTGAGGC3 1R0 

GTCTTTGCCA ACAG3AC 2GG GATGCTGGTG GTGGCCTTTG GGCTGCTGGT GOTCTAEATC 24 0 

CTTCTOG 3T T CAT3TTG?AA GCGCCCAGAG CCGGGGATCC TGACCGACAG ACA3GCCCTO 300 

CTGCATGATG G3GAGT3AAG C A : GCAGGAA' 3 GGGCTCCCAA GAGCTCCTGG TGGTGCA3CC 360 

TGTGCTCCC3 TCAGAA3CTC TGCTCTTCCC AGGGCTCC CG GCT3GTTTCA GC AG 3CGACT 42 0 

TTCTTCCAAT GCTGGC3C3A GACTTCTT3C CTG03TGCTG GC3TGCCCTC TCCGJN30G3 4?0 

ttgct3CCT3 ?arrr::c ttggtgcytt tgctgg3TGG tggg:: civk/j ctctccggcc 540 

GCTTGCTGCC TGT'CT G3TTT CCTTGGTGGC TTTGCTGGGT G3T3GGCCTG CCTTCTCTGG 60 0 

CTGCTTGCTG CCTGT'TTGCT TTC CTTO 3TG GCTTTGGCTT CP3CACTCCT TGGCGTCAS3 66 0 

TCTCAGGTC 3 TCCATTCACA CGAGG TCCTC CTC:;CTCTGG CCG3TCTTOC TCCTCCTGTC 72 0 

CAGCCZCATC T3ZATGTGAG GT0333TGGA GACATCATGZ ZGTZATTOZA GAAAGG3GGA 960

g?:jG: zgc re a ^ gzag zttg tgztg agzag 7TGAZCGCTZ TGAZCTGTTC TOT/ITC 3 PAT 1020

TGZTG ZTCTG TGTCTGZATG TATTGTGACC GTGCG3CTZC ACCTGTTGCA OCTOCTG2TA 1080

CAGCTGAG3Z CTG3ATCZZG GCCTTTCCCT GTG A ZTT A Z Z TOT ZTGT TAG C GZO ANG2 AG 1140

CCCTACAAAT ZCTGGTGACC rOZTCTCOZA AG AA ZAGA.Gr GTGTGCGZAG AIGTZC ZAGT 1200

AGZGATGA jT AACAGAGZTG GCTGTGZACT TCZTCTACTT CTCCTTOZTG GATCAOG3CC 1260

TTCCTGZCTC C ZGCTOZOZA GZTZTGZGZT TGZTCTCTTG CXZAGZOZ ZOZ A3CZ Z ZTCTZ 1320

ACZAZTCTGC AZCTZACCAT GZAGZTGATZ G ZAAAGTTGT 1 ZGTGT Z ZAGT CTGC AGZA< ZZ 1380

CCrOZGAGZC ACTGZCACZT T ZAGAGZC-GT TCZTTGCTGA GACCCACATT GZTTGA ZCT3 1440

gc z z za re at goztsctt zo ctgzoc zaa z CTAGZGTTGT GTGC ZATGGT AZAOZTTGAZ 1500

c r zttzgt zt t ztt zagz-zg a z> jaaata zg GT Z-ZAGAGZ Z G3AAG ZGTGT T3CTZCTAAG 1560

TOTTOZTOCT GTOZZ'rTT'rT TGZCTTCTCC AAA 1 ZACGZ A- Z TOO ZA 1 ZGT< Z 0 C AAOCTT Z A Z 1620

ACTGZTGT ZO TZAGTAAZCA AGTGAGAA ZC GT 33ZGTTTZ GAZOCZA.ZCT A ZTCTCT33Z 1680

AGZATCAOZVa TCCTACTCCT GZCAACATCA GGZCAAC3TZ ■ZAZCCCAGZC TCACATTOZC 1740

A ZATGTT3 Z- Z A iAAZGGZTA ATATTGAG ZG TGT'TZAGTGZ CTGZAGZCTT CAAAZO ZAGT 1800

gzgatgtgzt czagzgazzt ggztgozatg AZCAZCTCC Z CGTCTCCATA GZGGFAGGZA 1860

tttcactgzt ' r fa' p zaa zc t ozagtttcat TAAATATGTT AAGAAT'ZAAA GZTG TCTTTG 1920

' i" n ; ag z zt- z : t at a a< : aaaa at at a^t a z : C 1 Z- 3 ZTG 3 ZT T AAA' ^ 1965



(2) INFORMATION FOR SEQ ID NO: 117:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 503 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117:

AGTGATCOZC TTOCCTC'ZGC CTCCCAAAAT GZT'ZGAATTG TAAGCGTGGG CCTCTGOACC 60 

OZGCCTGGTO C 1 3C A ATTT AA AAA03CACAG CC ACCATTCC CTYTCCAGAA AGCACOZAGA 120 

TGGCTTTGZG AGAAC CAGC C TCCTCCATG3 A3GAAAGZTT G3GATCTZCC TTCCCACCTG 130 

GGGAGGAGAG GZATCTGTGG AAAATCCTTC TGACGGACTT CCCCTCAGTG CCTGATCCAT 24 0 

ACTCAATAGT AGAAAAAGTA AGAAATATAG AAAGATAGCA GATACAG3GA GACAGTTCCC 3 00 

60 CAAATAGZTG AGOG AWT AGO GZAGAAGCAA TATTGAAGAC CTAATAGOTG AGACATTTCC 3 60 



AGAACTGATA AAGT3CATCC AGO CACAO AT CAAGCAGCCC AGAAAATTCC AGSCAGCATC 42 0 

AACAAATAAA TAGCCCCACA TG3ACCCGTG AAAATGCAGA AGACCAAACA AAAAAGTCO 3 4 80 

G TC AA 1 3 AGO C AG A< 3T P AAA* 3 AG J 5 03 



(2) INFORMATION FOR SEQ ID NO: 118:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1133 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118:

GGCACAG3TT GGAATGAACC CCTGT'GGATA AGO 3GG ACTA T r A< 3 A T AG A\ TAAAC AT< 3 A A 6 0 

TAAATGCTTG ATGAATAAA 3 GOTAATC 3TA CCTTCCCAG3 CTGACACCTC O3AGTG3A0A K-O 

25 CCACACTTCA CTTGAA3CCT TAGAAACCTT T3CCACCCAT G3TTCCAG3C CT<S3CTTCAT ISO 

GTD3GCATTT CT'CACCCCCA GAA'TA 3GCCG CCCG3CT3AA GAAACTACAA GA-3CAA3AGA 24 0 

AACAACAGAA AGTG3AGTTT 3GTAAAA3GA T* ^ 3 A 3 AA G 3 A G3TGTCAGAT TTCATTCAAG 3 00 

A0AGT3OG3A GATCAA3AAA AAGTTT'CAGC CAATGAACAA GATC 3AGAG3 A3CATACTA3 360 

A.TGA.TGTG3T G3AA3TGG3T OG'^'^TGACAT 3 , ^TT , ~T'~'CTT T3GCGASV3AT GATGACTGT3 430 

35 »3CTATGTCAT < 3 ATCTT 3 AAA AA3GAGTTTG CACCCT3AGA TGAAGAG3TA GACTCTTA3C 4 SO 

GTCGTG3AGA S3AATGG3AC C3C'3AGAA3G CTGAG3AGAA 0O3GAACNT3 AA3GAG3TG3 54 0 

■3C3AiGAGG3A AN 1 3 A G 3 AG 3 A G3CAGCCCAG C\AGOG3CCTG T3GT3GT3AG CO CTGCCAGC 600 

GACTACAA3G ACAAGT AC AG CCACCTCATC GG3AAGG3AG CAGCCAAASA CG3AG3CCAC 660 

AT3CTA3AG3 3 2AATAAGA2 CTAG3-G2TGT KTCC2C3TGG 2 2AATAA3A3 G3A0A3A03C 730 

45 T3 3A?rGAAG A3G3TAT3AA T 3A3A T CAS A 3CCAA3A\G2 3T3TG3 GGSA G\3CGG3GAA 730 

3A0 7TS3CS2 3AA3C T 3 3 IA ^^Z ^CCO:?^ 3 7AG2T2 3 2T rT3A3C7 3I'G GG3CA303CA 34 0 

G30G3CAGGG A3A3ACAA S3 CTG3TGCTAT TA3AGCCCAT 3CTG3A3C 3C CAC3TCTGAA 90 0 

OOA3CTCCTA CCAG3TGTCC CTCA GGOTGG 3GGAAAA 3 AG 3TGTTT3ATT T3TOA3 3 3TT 96 0 

Q3A0CTTO3A TATGTGCGTG OS AT 3TOTGT GTCTGTGTGA GAG T3TG AAT OCACAG3TG3 102 0 
(2) INFORMATION FOR SEQ ID NO: 119:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1101 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:

GGGCACAGCT GAAGC TGCAG A33TCCCCAG GG3ATG3CT3 CTCTCCCCCA GGAGCCCCGA 

GGCAGGOGAG GCAGAAAGCC T3GX3CTCTG3 G3-3GTG3CCT GCGGAC A 3CT 3TGCTGTGG3 



ccgoggck;tg ggcctgtccc a3ag3gncgt ogag3T3Gtg gttctgagca <3ccagctq3g ibo 

TG3TGTCTGG GGATAGCTG:; GAGGIACAGC G3CTG3CATG TGGGACTGGG ACTG3AGTG3 240 

TC3CTQ3TCT TGGCCI CTGT G3CT :aG3GT T 3CTC TGGT C TG3CTGAGTG CAGGGGCCAA 300 

C-3r GGCACAGG GCCAGIGAGG 0O3G3CAC3C T 7GG3CCCTC ACCTGTGAGA TG3GGTCGGA 360 

ATTTKACACA GOTTANGGCT TGGTTCTT 3G TKGTNGAMCG TGGACTYCTK A3AACGGGAG 420 

T3CTGGTCCT GAAAG30GTG GTTG3AGACC A'3CTGCTTTT CTCG3TGTTT TTCTCTTAGG 480 

A3ATTAAACA AAAACAGAAA GCACAAGAC 3 AACTCAGTAG CAGACCCCAG A 3TCTCCCCT 54 0 

TG3CACA3GT GGTT33AGAC 3GGGAG ACG 3 A ;CTCGTCCA GAAC3GGATT C AGCTGCTCA 60C 

ACG3GCATGC OCCGG3GGC:: GT3CCAAAOC TOGCAG3GCT CCAG2A3GCC AA3CG3CA.CC 660 

AC3GA3TCCT GGGTG3CGCC CT03C3AA2T T iTTTGTGAT AGTT3GGTTT GJAGCCTTTG '?2 r - 

CTTACACGGT CAA3TA3GT3 CT3AG3AGCA T :GCGCA3GA GTGAG3CCCA G7CGCCGAGA 780 

CCCAAG3CGC CACT3A3GG3 ACCGGGOACC AGAGCGTGAC CTCGG3A3GC T G 5 AC AC ACT 84 C 

G3CCAG3ACA 3GCAGA 3CCA CCAGGCTCCT AG3TTTA3CT TTTAAAAACC Ti 3 AAAG3GG A 900 

A3CAAAAACC AAAATGTGTG ACTGG3CTTT G3AGGAG^CT GGAGCCTCAG CCCTGTCCTG 960 

GCCACGGGCC '3CTG3GG3TG GTGTGGGTGG G3CTTGTGTG CTGGATTTGT AGCTTATCTT 102 0 

CCGTGTTGTC TTTG3ACCTG TTTTAGTAAA CCCGTTTTTC ATT TT AAAAA AAAAAAAAAA 1080 

AAACTTTGGG GGGG3G3CCC N 1101 



(2) INFORMATION FOR SEQ ID NO: 120:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 282 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120:

A'G3TTCTCTG TCCAGTCTTG AA G I X3TGGG S TGTCTTGGAA CTTTCCTCAC CCCTCTCAGC 6 0 

CIGAATATTC CTTCCATGGA TT 2 C AC T C AA CCAGACTTTG CATCTGTGCC TACTTAATCA 12 0 

ACCTTATCTT TG 3 AA T AT 3 T TCl-GGCCCAC GTTCCACTCC 3TG3TTCT1G TTCCTCCT7G 13 0 

GCCTAACTTG TC2CTTCTCC A ITTCACATC CCC3GTGGGA CAGCATTCCT CCTTCCTCCC 240 

AACCTCCCTC CG TIT CAB A A AAAAAAAAAA AAAAAAAAAA TT ?8 ^ 



(2) INFORMATION FOR SEQ ID NO: 121:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2635 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121:

taagg3ggtg tgtgctcacc tcctcct3ac ccttaa2act cgtgtcctgc c3agaccaac 60 

agagagagct gtccctgaga oogo'ggag^ag aa3ca'g2tgc cgaaagctg3 agoctttccg 12 0 

cact3t3a3a ccatgatctt cctcctg3ca gggga3agcc acccacaog2 cat3tccag3 13 0 

ccca:ttc:g tcag:ccc2a g3gyttc3tt <3tgg3ccctg tgaggatt:c cta3G3ctgc 240 

3ccgcagag3 '3gyttcc3ca ag2tctgttt tgaa-ggctgc aatgtg3aaa agt3agaagt 300 

cagag3<3aac aggacag3tg cagc 2-30:;ct ctgag3ccac aoctcacacc t3g3tgttcc 36 0 

ccaa'iatc og ct3ag3agtg t3ag3tcatc tcaccagatg agaagag3cc c t 1 3tg 3 attt 42 0 

ytttt 3tttg tttgttgctg ttttcoccca ccgatccagt t0tcctcao2 aaa3caaatt 480 

cottaacacc tttogtggag aatttottac ccagacttg- 0 g3ct3t3at:; c 2cttcagtg 54 0 

cgtggtga:;t goag3GT3tg tgcgt:tp3-cc tgtgtgtgaa G3tg<3GGgc3 AT0 3TGGT<:;G 6 00 

cctg03ag ig tgag3agac-g cccc3tgtgt >0ctg3gtgag t3gt3ggtct g3g3tcaatg 66 0 

cagtgaggct ctct3ggtga q3ctcocaac ct03cagtcc ccagcctccc a> 3c atc tgtg 72 0 

AOGGTCTGTT GGACTTTACA GAA'3AGCCTC ATCCYGTCTG CCCCTCACT2 T3CCCTGGAA 780 

TCAACATCTT CCGAGTCCTT CTTGG3GGAA ATAGCAGAGC CCCACTTAAC TCCATAAACT H4C 
tcccagotgg agttctggaa ctttcctcct cggggtc-ggg gtc-ggcgtttg ggaaggatc-- 1140 

tggggg:tCCt gg^ggaa^gaa gtagttcaga GGAAC- GGTGT CCCCTG7CCT CGGGATGTCA 1^00 

CCCTCGC-CTC CTC/AGACACG TGCTCTCTCT GTTCG' TTGGG T CTTCGGGGTG TGCACTr"^^ 1-60 

TGTGTCCTTG TAAATATGTT TTAGGAAGAA AGCAAAAGTAJ AGTGA--.GT.-jG CG7GGGGTAG 1?20 

GATTOCAGGG GTCCAGCCTT GCCTGTTTCC GAAGG2CCCA CAGGG-GGG7G 1 CC-C G CC ACTG 1^80 

AGACTAGTCC GCTCAAAAGG TAGAGAAA_-,C AGCAGGTCGC TGTGAAAGGTG A 1 G GGG GGC G 1440 

tcaaagtggc t/t/tttgttag ao\ac<;ttaa c^ggtgctca tgagcaaggt tggagatcgc- isoo 

TG GTT COTCA GCTCCTTGAT TTGTGAGGTT GA C GA AGGGG CCTGCCAGGC AGG 2CCTCGA 1560 

g?xc:t.;tc g tcgatc gct cgctccttcc tgggg:/gact cccgixaggtt agggaggtag 1520 

gg^aattagg ggcatgotgg aag aaggtt a a.ccagattgtt c aaag aag gg t*r?gt*pgct" l6?-0 

AG 3 TG 3 AT GT CAGATCTGGT AG:;TTGCAGC AGAGAAAATA AATGTGG GTT CAGAGACCAC 13 0C 

TCAGAGAGGG T GGAAGGGTG ATC-GAGAAGG A-GGGGX: TGGGAGCGTG GAAG-.GGAP.GG 1350 

GTTGTX^TG OTGGGATCTT GACTCCCCCC TGTTCT G CCA GAG GTXGG2-GG G7GGTCACCC 1920 

CYCTTCACTC GAG2GGGAGGT GCCTTCAGCC CTCCA7GAGC TTCACCCGCT TG GAACTTCA 193C 

GTTT'G^A.J^G 'G^TGGC^GTCC G ' TTG GG ATGA AJ2 ACGC-GGA Z CGTGTG-GTGG AGGAAAGGGC 2 04 0 

GCTGOXK,CT GGCGGCAGTC GC AG GGGAG A GACACCACAG AAGGAGA.2 2 G AGACACCCCG 21-10 

AGGAAGTTCC CAGCAGAGTA AACTOGTTTC CAGCCC GAAG G G T GG TG .AAA CCCCrGTXGA^ '>2^0 

TGCAATAACT GAGCTTAGAG TTAGGAATTG TX5TTCAASTG CTTGGATGGG CGTGTGTAGA 2280 

TTTAACTGCT GAAATTGTAT CTCTCAGTAA TTTGAGATGT CTTGTAAAAA ATGGAAAAAG 23 40 

AAAGTGTCAG ACTGTGT" 3CG TG TC-CGTTGA CGGGCACTCA AGAGTCCGGT GA.GTGATC GA 24 00 

TCCCTCCG7T TCCOCTGCGC CGGGATCGTG TCACGTCCCG CCCVGCCCCC AGTGGGGGAC 2460 

CCTGCCTG3T GTGGTCTTTA TCP GG C T A TT ACTCAC-CCTA AGGAAAGAAG TACACTCCAC 2520 

ACATGC A T AA AGGAAATCAA ATGTTATTTT TAAGAAAATG GAAAATAAAA ACCGTATAAA 2 5SO 

CAGGAAAAAA AAAAAAAAAA ACCC.NGGGGG GGOGG G GGT A AGCGATTTGG CCTAA 2 63 5 



(2) INFORMATION FOR SEQ ID NO: 122:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 994 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122:
3aatt3ggca gag 3tt 2202 gaagataggg aataaggaag 

AAGCACAGSA 1 3TAG G G GAGA TATA2AG3 1 3G T 2 AG jATAA' I? 



S 2AAGAG3TG AAA2AA3ATG TGAGAGACAA 
GTTCA3AAG2 GCAIAGACCG TGG2G3ACG3 
'3OT5^:TA ttttaargga GATG3TCCTT 

cctc 2aggoc G2G3 3cg3a r atgtc 3tg :g 

CC2AC3TCCT TCTACGCTGC TGAAAGACTA 

TGATGTCGTG AAAAGA'TTCT TGTCTTTGGA 

CAAG2AAOAA CAGTTTATGA A3AAGATTGT 

GG2 TCG AATT ATTG7CTTGT CTGTCAAGAT 

TCGAAAG3A' 1 AAA 3C0 1 ZAC A AA 2< 3- 3TATCT 

GCTCAAAAAG CTCCGTAACA OTAACTATGA 

MTTGAGTAC AC 2TTCC ZCG CTCTGTATTA 

GAAG3CT2TG TG^vpTCGGG TTTTCOAG^A 

CTTAAAG3CT G2 AGO AG 2 AG CCCAAAAAOA 

CAAAGCCATA CCAAAGACAC TCAAAGACAG 

AAAAAAAAAA AAA\AAAAAA AAAAAGG3GA 



■gg3taggga 
caatg2gag 
agccctctt 
aaaccag3c 
cagaatgtc 
aatgg2caac 
'pgcaaaccca 
3cg2agttat 
g3taatgag3 
t3tctttgag 

GAAGAG2C 



122 : 
CACAGGAGTA 
AAA 1 GOG 
A 1 j A WTO 3* G 3 
GGG2ACAGAA 

YTTrrcT3CG 
ca 3Tctagg2 
cctg3aattg 
aa 3aag3aga 

GAG3ACACCA 
GAAGAACACT 
ATTGACCAGA 
AAGATATG2T 
GACCG2CGAT 



1 ACTCAAA A< 3 CT 3 AAG AAC 



AGCAAAGC3G 
AATAAATT 



AG3AACCCAG 
CTGTTCAATC 



GGG GA 3 AAG3 
3G3TG3TTG3 
2 A G303TTAG 

ag3aactgag 
ta:jttctc::t 
tg3atgatga 
agaa 3gttga 
tg2taaaaat 
gatccctg3a 

T G 3 A 1 3 AAA : 2A 
G3AAAAAGAT 
GGGGG7TGG3 
TCGTGACCAA 
G AAGAAGA- 2C 
A2AG2GCTG2 
ATTTAAAAAA 



60 
120 
130 
2-30 
200 
3 60 
420 
430 
S4 0 
£00 
£■60 
720 
780 
840 
900 
c '60 
994 
(2) INFORMATION FOR SEQ ID NO: 123:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1542 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123:
C TC-G AC ATCT GATGAAACAG TGGTGGCTGG TG3CACCGTG GTGCTCAAGT GCCAAGTGAA 300

AGATCAC3AG GACTCATCCC TGC AAT'3GT Z TTAAC3CTGC TCAGCAGACT CTCTA.CTTT3 360 

GGGAGAAGAG AGC2CTTCGA GATAATCGAA TTCAGCTGGT TAMCTCTAOG CCCCACGAG3 42 0 

TCAGCATCA 3 CATCAGCAAT CT03GCCTGG CAGACGAGGG CGAGTACACC TGCTCAATCT 480 

TCACTATGCC TGTGCGAACT OCCAAGTCCC TCGTCACTGT GCTAGGAATT CCACAGAAGC 540 

CCATCATCAC TGGTTATAAA TCTTCATTAC 1 3Q 3AAAAAG A CACAGCCACC CTAAACTGTC 600 

A jTCTTCTCG GAGCAAGCCT GCAG7CCGGC TCACCTGGAG AAAGO 3 TG AC CAAGAACTCC 660 

ACGGAGAACO AACCCGCATA CAGGAAGATC CCAATGGTAA AACCTTCACT GTCAGCAGCT 72 0 

CGCTGACATT CCAGGTTACC CG3GA3GATG ATGGGGCGAG CATCGTGTG2 TCTGTGAACC 780 

ATGAATCTCT AAAG3GAGCT GACAGATCCA CCTCTCAA2G CATTGAAGTT TTATACACAC 840 

CAA2T3CGAT GA7TAG3CCA GACCCTCC3C ATCCTCGTGA GGGCCAGAAG CTGTTGCTAC 90 0 

ACTGTGAGGG TC;J03O2AAT CCAGT 2CCCC A3CAGTACCT ATGGGAGAAG GAGGGCAGTG 960 

T3CCACCCCT GAAGATGACC 1 Z A< 3GAGAGTG CCCTGAT3TT CCCTTTCCT: AACAAGAGTG 102 0 

ACA3T3GOAC GTA ::GG0TG2 ACAGCCACCA '3CAACAT3GG CAGGTAGAAG ' 3C C TACTAC A 1080 

C' 2CTCAATGT TAATGACCCC AGTCCGGTGC CCTCCTCCTC CAGCACCTAC CACGCCATCA 1140 

T3GGTGGGAT CGZ -3GCTTTC ATTGTCTTCC TGCTGCTCAT CAT3CTCATC ■TTCCTTGGCC 1200 

ACTACTTIAT CC:;GCACAAA Q3AAC2TACC T< ?ACAC ATGA G3CAAAAGGC TCC3ACGATG 1260 

CTCCAGACGC ^ACACGGCC ATCATCAATG CAGAAO30GG GCAGTCAGGA <3Q3GACGACA 1320 
AGAAGGAATA 'rTTCATCTAG AGGCGCCTGC CCACTTCCTG CGCCCCCCAG C<3CCCTGTGG 



GGACTTG 2TG <3030CGTCAC CAACCCGGAC TTGTACAGAG CAACC3CAGG GGCO 



CCCGNTTGTT CCCCAOrCCA GCCACCCCCT TGTTACAGAA TGTYTKGTTT GGGGTGCGGT 



1330 



'GSCCCT 144 0 



1500 



TTTGTOATTG GTTT1 J03ATN 03GGAA03GA GC^ANGGCCX; GG 1542 



(2) INFORMATION FOR SEQ ID NO: 124:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1390 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124: 

CAAGCTCTAA TACGACTCAC TATAGGGAAA GCTGGTACGG CTGCAGGTAC C , 3GTCCGGAA 60 
TTG r 3CGG* JTC
2TCGTCATGG 

1 o ag 3 AG 1 otgo 
agag3tg3ag 

GGAGGGGAGO 

ggggag3ag3 
G2Taagaaag 
gag3gtoaag 

G3AG3AG0G3 
GGAG2AG3GO 
G3AG3AAG3G 

gat oaagtag 
::ta::g-:a:t 
a3gt 3tgatt 
ggg3aagtt3 

OTO 2CTCATC 
220TOTTG3A 
ATC 2T0GG3A 
GAGGTTG3TG 
AATTTAGGOT 

AAAAAOT30A 



tg3G3Got y 



TOGO X5CCTC 
OT3GTAOTTG 



:TC' 3GA 



OA'3GAG2AG3 
0tGAG3GGTGG 
G3GT3GG :tg 
AG3AAG3T 3T 

tg3g3aannt 
gtgarga*3GG 

GTTG :.CGTGG 

oa3G3ggag3 
gtag3agaga 
at 3aag3agt 

GAG 3 AG A- 3G A 
■ j A ^'.jACOj-j- j 

atg ioaga3g 

2TGAGAGTTG 
AGT J AT 3GTG 
TG3GTTG3T3 
TGA3AATATA 

rATG/^A rCCA 



GGG3GTGGGO 

G3OAGAAG0A 
G'.jAgAA 1 Z*Z _ A 
G3AG3AGAAA 

gaaaggagtg 

AG3AG3AG3A 

atga 1 3gagta 
ggatgactga 

' 3 ■ 3 AA- 3GTT' }T 
TAAATG-G3AT 
GGAAGTTGAT 
GGG3GGGG 3T 

3TGT3GG3TA 

tgg3gagg2a 
tgg::agaag3 
tocgagaggt 



A3GGTG3AGG GATG3TTOTG CACTGAGGCC 

GTAG0G3GG3 GTCTGGTAGT OGG"TTTATO 

G3AT3AG30G G3GAAGA3GG AGTG2A-3AAT 

'2AG3GTG3GO 0O3T3GAG3O TGA3GAG3GG 

gag0tog3ga g30g3gtaga g3gooag3gt 
' 3 at' 3agaa3 1 z a' 3 3 ag 3aa ^ ": tgtgatcgt a 
■ 3< 3g3 aa^ tv- 3 a' 30tgt 3gg 3 avttg 3a 
oaag3G3Gaa. agogggaggk tgag3gagag 

' 3 AG TOG 3 AGG G3' jM' r OA 3T O 3AA 3 AA 5 ZA 
GAAG3AG3AG GAG3AGAG3A A3GGGGG3GA 

ootgaaagtg aa3gag3g 3t ttgtggtg3a 
g3aa3agtgo gagag3ttgo t3agaga3tt 
g-gtottg3aa gaootgg3tt cooaggtggg 
g3ag3a3gt3 gt03gtoag3 g3agtataa3 
gta-2ataac0 g-2a3aggaa3 toggog3 3gt 
3T rcvro 3 30 gag3TT3Go 3 a a o 2 gag 2 a^ 



GGTGG2TATA oatcttcat 3 ggtgo3oago 

GTTATA3ATT AAAG3G 2TGT 'GA3TAGTG0T 

CC TZGCC'T A Z > Z ATG Z TAG AT AAGGAGGTGA 

GGGGAGG3TG OCTTGGAAGC TGGT3AAGTO 

AAZ GyTA^t*^. I" ■ jTTG-._ AAAvA AAAAAA'-lt'AA 



130 
24- 
30 - 
3 60 
42 ":■ 
48- 
54 0 

660 
72 0 
7 8U 
840 
90 0 
960 
1020 
103 0 
1140 
1200 
1260 
1320 
13 8 0 
13 90 
(2) INFORMATION FOR SEQ ID NO: 125:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1288 base pairs



60 



GACGCTGACC ACGTTCOTCT CCTCGGTCTC CTCCGCCTCC AGCTCCGCGC TGCCCGGCAG 12 0 

CCG03AGCCA TGCGACCCCA G03CCCCGCC GCCTCCCCGC AXOXJT-rCG CGGCCTCCTG 130 

CTGOTCCTGC T O 2TQ 2 AGC T GCCCGCGCCG TC:;AGOG:CT CTGAGATC<~C CAAG2GGAAG 240 

CAAAAGGCO: ATCCG02AGA <2GG AGGTGGT GGACCTGTAT AATGGAATGT i3CTTACAAGG 300 

10 GCCAGCAGCA GTGCCT' ^GTG GAG AC GGG AG CC ::TGGGGCC AATGTEATTC CGGGTACACC 3 60 

TGOGATCCCA '3GTCGGGATG GATTCAAAGG AG AAAAGGO 3 GAA r P* 2 TCTGA GGGAAAGCTT 420 

T1A'GGAGTG2 TGGAGAGCCA A 2TACAAGC A GTQTTCATGG AGTTOATTGA ATTATGGCAT 4 30 

AGATCT'I>GSG AAAATTGCGG AGTGTACATT TACAAAGATG CGTTCAAATA GTGCTCTAAG 540 

AGTTTTGTCC AGTG^OTCAC TTCGGCTAAA AT'GCAGAAAT GCATOCTGTC AGCGTTGGTA 600 

20 TTTCACATTC AATGGA02TG AATGTTCAGG ACCTCTTCCC ATTGAAGCTA TAATTTATTT 650 

GGACCAAGGA AGCCCT2AAA TGAATTCAAG AATTAATATT CATC G2ACTT CTTCTGTOGA 72 0 

AGGACTTTGT GAAGOAATT^ GTOCTGGATT AGTGGATGTT GCTATCT3GG TTGGCACTTG 7 B0 

TTGAGATTAC CCAAAAGGAG ATGCTTCTAC TGGATGGAAT TCAGTTTCTC GCATCATTAT 840 

TGAAGAAO T A CCAAAATAAA TO 2 TTTAATT TTEATTTGCT ACCTCTTTTT TTATTATGCC 900 

30 TTGGAATGGT TCACTTAAAT GACATTTTAA AT AAGTTT AT GTATA 2ATCT GAATGAAAAG 9 6 0 

C AAAGC T AAA TATGTTTACA GACCAAAGTG 1 X2 ATTTCAC A TGTTTTTAAA TC T AGC ATT A 1020 

TT2ATTTTC-C TTCAATCAAA AGTGGTTTCA ATATTTTTTT T AGTT j 3G TT A GAATACTTTC 1C30 

TTOATAGTCA CATTCTCTCA ACCTATAATT TGGGAATATT GTTGTGGTCT TTTGTTTTTT 1140 

CTCTTAGTAT AGC ATTTTT A AAAAAATATA AAAGCTACCA ATCTTTGTAC AATTTGTAAA 1200 

40 TGTTAAGAAT TTTTTTTATA TCTGTTAAAT AAAAATTATT TCCMACAACC TTAAAAAAAA 1260 

AAAAAAAAAA AAAAAAAAAA AAAAATIAA 128 8 



(2) INFORMATION FOR SEQ ID NO: 126:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1517 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:

AGTGGCTTAA AGGCATCGTT TTAGGGATTA CTQ3GAAGTA TCTTCAAAGT AATACATGAG 6(5 

AAACATTCCT TCCTAAATCC TTTATTATAT TGAATATCGT ATTAATTGGT TTTCAGAGGT 12 0 

60 



WO 98/54963 



PCT/US98/11422 




?AAATTAACC ATG T ATTC C T GCAATAAATG TCACTTGTNT CTTGTATATA ATCTTTTTTA 
AGATGCTGGT CTOCAGTTTT CTTTTTTTGT GATAATCT'GG TTTTTGTATC AGTA^TACAG
GCCCCATGAA ACGAGTTGGG AAGTGTTC A 2 CTCTCTTGTA TTTTTTCAAG AGTTTGTGAA 



GGTTTTTTTT GAGGGAAGTG TTC TG AT AA C TAATTCAGTA TCTACTTTTT ATAGCTCTGT 



(2) INFORMATION FOR SEQ ID NO: 127:



180 



TATATTACGG GATTGATTCA TTAGTATTTT GTTGAGGATT TTTGTGTZTA TATTCATAAG 24 0 



300 
360 



GAATT^CTAT TAATTCTTTA AAT GTTTGG T AGAATCTACC ATTGAAATCA TGTGTCCTG3 42 0 



480 



TCA 1 3 ATTTTG CTTCTTCCTG AGTTAGTTTT GGTAATTTGT GTATCTCTAG GARTTTGTCC 54C 



600 
66'.) 



1 5 ATTTCATTTA TCTC ATTTG T TG3CATAAAT TAAACTAAAT TTGrGCCTGAG CCTACCTGTA 
TATCTTGAGT CGCTCTGTAA G3AACTGT AG CCTAACTTGT AC AT AAA< 2 AA ACTGAAATCC 
TAAATTAGGA ATGTAGTTTT TGTAACAGCT C CTGAGTC T C AGGCAGTCA2 AGCAGYCAAG 72 0 

TCTGTCAATT GCAGGCTGCT AACTAA'GCAG CCCATGSTCA AATGAGGCAA AAACCTTTGC 



7 8 0 



TTTTAACACA TAGTATAGCT TTGTAATCCT TTTCTTGCAC ACTCGGGTAA TTTCTTCCTT 84 0 

TTTCATTCCC KGWATTTTCC AKGAATATCA RTCTYCCTTT TTTCCCCTCC TGTCAGTCTA 900 

GC T AATO j TT TGTCAATTTT GTTGATCTTT TGAARAACAA AC CTTTGGTT CCAGTTTCTT 960 

GTTGCATATG CTGARTATTC T CAT AATT GG AGTGGAAAGC TGATCTTTGA TTACTTATTT 102 0 

TACTTAQ5GC TGAC5GAGTTC ATGGACTTCG CAAAACCTCC TTGAATCTAA ATTGCATCTT 10P0 

r^^^T TTCTGGGCTG AAACATGTTT TTTCCCATCT WANAWACCCT TGGTCTTTTC 114 0 

35 ATKG3CGATT AAGACTAGAG AAAGTTCTAG ATMCCTTGTC CTTTTATGCT GTCATTTTGT 1200 

TTAAAGGCTT TCTATGTAGT AAAACTATCT ATATAGACAA AATAGAGCCT TGA3TTGTGG 1260 

TCTTGAATTT GATCAACATG ATTTACCACA TTCTGTACTG GATATTTCTT -ZACCTGCTGC 132 0 

TACTGTAAAC CATTTTATTC TTGGATCTTC TGTAGAGTAT ATT ATC AC AG GTACTTTTTA 13 80 

CAGGCGTGTC TAATCTTTTG GCTTCCCTGG GCACATTGAA AGAAGAAGAA 1TGTCTTGGG 144 0 
45 CJLACACATCA AAT ACGCT A A CACTAATAAT AGTTG ATG AG CTAAAAAAAA AAAAAAAAAG 
: . j'J: vYtlAAA'-ji J ; . C C AAAA 
1517
7GA.ATCTATT CTTTGAACAT TCTACAACAA GAATTACATT ATACTGTTAT ACCAGAGTAC 60

TTGTGCAGTG TGAAATAGAT T<3GTTTGGAA AATGAACCTG GCTTTG3TAT AA YTT ACAT* T 12 0 

CACAGGCCTT TTTGCAAATG TGTAACTTGC CTATCAAAGT A 3TTTG' rAGG GCAAATGCAG 180 

AATATATGTC TCCATCTGGT AAAGTAC JTT UTAYTCATGT G 3G AAAT< J AA GTAGTATCA 1 j 240 

10 AACTT>3GTCC AATA3TC3AA TTTGTTAAAG CCAAGGGCCA TTCTCTTAGT GATGGGCTGG 3 00 

AG j AAGTC C A AAAAGCAGAA ATGAAAGCTT ACATGGAATT A3TCAACAAT AT<3CTGTT(3A 3 60 

C~3CAGAGCT GTATCTTCAG T3GTGT3ATG AAG3TACAGT A 3G3RM< 3ATC ACTCAT3MTA 420 

GGTATGGVTT 3 TCCTTA3CCT T3GCCTCTGW '."7TC AT ATTTT G3CCTATCAA AAACAGTG3G 4 80 

AAGTCAAACG TAAGNTGAAA GCTATTG3AT 1 3 3G3AAAG AA GACTCTGGAC CA 3GTCTTAG 54 0 

20 AGGATGTAGA 3cagtgctgt CAAGCTCTCT 3TCAAAGACT G3GAACACAA GCGTATTTCT 600 

TCAATAAGCA '3CCTACTGAA CTTGACGCAC T3GTATTTGG CCATCTATA3 A3 3ATTCTTA 660' 

CCACACAATT 1 3A 3 AAATGAT GAACTTTCTG AGAAGGTGAA AAACTATAG3 AACCTCCTTG 720 

CTTTCTGTA3 GAGAATTGAA CAGCACTATT TTGAAGATCG T3GTAAA3G3 AG 3CTGT 3 AT 7S0 

AGAGTTATGT GTTAGTOTCA G3AGTCTTAA CTTTTGAAAT ATGTTTTACr TGAATGTTAC 840 

30 ATT A< 3ATATT G3TGTCAGAA TTTTAAAAC 3 AAATTACTGC TTTTTGAAAC C TC AAATT AT 900 

A T AATGT AT C TTATGTATGT GCTTTATATT GTTATTTGTG TAT AC ATT AA AAT AATTCT 1 3 960 

AATTATTTAA TCTGATATGT TGTATTCTGT ATCTTGAAAT TTTTGTTT C C TTGAAACATG 1020 

CATGCATTTA AAAAT AAAGC TTAAACAACT GTAAAAAAAA AAAAAAAAAA CTC 107 3 



(2) INFORMATION FOR SEQ ID NO: 128:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 300 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128: 

CAACCCCTGC CTTTTTTTTG TTTTCCATTT (3CTTGGTAGA TCTTCCTCCA TCCCTTTATT 6 3 

TTGAGCCTAT GTGTGTCTCT GCCCGTGAGA TGAGTCTCCT GAATACAGCA '3ACTTACTGG 12 0 

55 TCTTGACTCT GTATC CAATT TGCCAGTCTG T3TCTTTCAT TTGGAGCATT TAGCCCATTT 180 
ACATTTAAGG TKAATATTGT TATGTGTGAA TTTRATCYTR TCATTATGVjT GTTAGCTGGT 24 0 

TATTTTGCTT GTTAGTTGAT GCAGTTTCTT C 3NGGC ATC A ATGGTCTTTA CAANTTGGCA 3 CO 

60 



AAC AT AGT AA GACTCCATCT CTACAAAAAA AAAATTTTTT TTATTATACT TTAAGTTTTG 
GGTTAGATGT GCAGAACGTG TAGTTTTGTT ACATAGGTAT ATACGTGCCC TGGTAGTTTG 



60 



(2) INFORMATION FOR SEQ ID NO: 129: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1275 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129: 

GGCAGAGGCT GTCCCT3CTG CCCOTGCAAA AAAAACCCCC TCTGGTGTGA GCAGGATGGT 

TGGAGGTTAT GTGAGCTCCT TC2CCTTTCC TCCAGTTTCC TCTTCCCTTC TCCTCCCT3C 120 

ct2ttttgct tttccctttc ttcctggtac cccctgccca ttcctgtatt ttct2c-2atc 180 
20 gccattctcc cctctccca: tgtc:ctaac ccgttcaaac tctttcctct taaatggtt:; 
a 2attttctc tcaccaagca caccccagta ttaattaaac tagctgcaaa caggcagcaa 
gtggtctacc atgacagatg ggttttgtgt gtgtgtgtgt gtgtgtaatt gtaataaaac 3 60 

ATATTGAKTC ACTCAATAAA CACAGAGTGT CTACTACATG TATCARGCAC TATCATAGAT 420 
GCTAATTAAC G AAA C TGAAA TGGCCAGGCC CTCACAGTGG CTCATGCCTA TAATCCCAGC 



240 
300 



480 



30 AC T TTGGG AG GATGAGGCAG GAGGATCACT TGAGGCCGGG AGTTCAAGAC CAGCCTGGGC 54 0 



600 
660 



CTGCACCCAT CAACCCATCA CCTACATTAO GTATTTCTCC TAATGTTACC CCTCTCCTAG 7 20 



780 
840 
900 



ccccccaccc cotgacago-c cctggtgtgt gatgttcccc tccctgtgtc catgtgttct 

40 CATTGGTCAA CTCTCACCTA TGGAGTGAGA ACATGTGGTA TTTG3TTTTC TGATCTTGTG 
AT A GCTTG C T GAG AATGT; 'G GTTTCCAGCT TTATCCACGT CCCTGOAAAG GGCATAAACT 
CATCrCTTTT TATGOTTGCA TAGTGTTCCA TGGTGTATAC GT< '. XZ C AC AI T TTCTTAATCT 960 
ATCATTGATG GACAA jTTTT C-CTATTGTGA AT AGTGC C AC AACAAACATA CGTGTGCGTG 1020 
TGTCTTTATA G2AG2ATGAT TTATAATG IT TTGGGTATAT AC2CAGTAAT GGGA7CACTG 

50 AGTCAAATGG TATTTCTCGT TCTAOATC 2G TAAGGAATTG CCACACTGTC TTC C A- 2 AATG 



1080 
1140 



TTTGAACTAA TNTACACT2C CACCAACACT AAAAGT 



-TATTTTT CCACAACCTC 12 00 



60 
(2) INFORMATION FOR SEQ ID NO: 130:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 472 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130: 

CNGAAACCCC GTGAACCCTC CCCGGGTTAA AAAGCCCCCC CT AAATGGO 3 GGAACGCYTC 60 

ACACGTTATA AAAAAGCACT A2AATGTTTT GAAAGCGAGA AACAACAGCT GTGTAGGGTA 120 

15 'OCTAGCAGTT AGTGTTGTAC A : 3AA< 2 AC AG A TATTTGTG2A TTTYTGCAT r TTCTAAGTTT 130 

GCTGCAATGA ' 3C ATGT ATT A CTTTCATAGT TATAAAACAC ATGCAAAAT3 CCCTTTTAAA 24 0 

ATGAAAAAAA ATCCATGAGT GTAAGTGATA TATATGCTTT GGAAAGCCT3 GGACGGTCAT 3 00 

TGTTTACTCT CAATAGTATG TGTTTGCCTT TGTCTTTTTG AGACATTTTG TTTTAATCTG 3 50 

TTGATGACAA TAACCTGTTG ATAATATAAC TTGATAACAA AT AAAATG AC TTATGATTGA 42 0 

25 AWMAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA NN 47 2 



(2) INFORMATION FOR SEQ ID NO: 131:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1950 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131: 



40 ACCTCTCAGA ATCTTCTCTC AGCAACCTGA GTCTTCGCCG TTCCTCAGAG CGCCTCAGTG 60 

ACACCCCTGG ATCCTTCCAG TCACCTTCCC TGGAAATTCT GCTGTCCAGC TGCTCCCTGT 120 

GCCGTGCCTG TNATTCGCTG GTGTATGATG AGGAAATCAT GGCTGGCTGG GOACCTGATG 180 

ACTCTAACCT CAACACAACC T07CCCTTCT GCOCCTGCCC CTTTNTGCCC CTGCTCAGTG 240 

TCCA3ACCNT TGATTCCCGG CCCAGTGTCC CCAGCCCCAA ATCTGCTGGT GCC AGTGGCA 300 

50 GCAAAGATGC TCCTGTCCCT GGTGGTCCTG GCCCTGTGCT CAGTGAC CGA AGCTCTGCCT 360 

TGCTCTGGAT GAGCCCCAGC TCTGCAACGG 1 3CACATGGGG GGAGCCTCCC G3CGSGTTGA 420 

GAGTGGGGCA TGGGCATACC TGA3CCCCCT 'GGTGCTGCGT AAGGAGCTGG AGTCGCTGGT 480 

AGAGAACGAG GGCAGTGAGG TGCTGGCGTT GCCTGAACTG CCCTCTGCCC A 2 CC CATC AT - . 54j0. 

CTTCTGGAAC CTTTTGTGGT ATTTCCAACG GCTACGNCTG CCCAGTATTC T AC CAGGC CT 600 

60 G3TGCTGGCC TC CTGTGATG GGCCTTCGMA CTCCCAGGCC CCATCTCCTT GGCTAACCCC 660 
T'GATCCAGCC TCTGTTCAGG TACGGCTG3T GTGGGATGTA CTGACCCCTG AOCCCAATAo 720

ctgcccacct ctctatgtgc tctg3agggt c2acagccag atcccccagc gggtggtatg 78 0 
gccaggccit gtagct-gcat cccttagttt ggcagtgttg gagtcagtgg tgogg3atgt 84 0 

tc^actcaat gaagtggaca agggtgtggg gitcctgctg gaaagtgiag ggocoocagc 900 

CACTGXCIG CAC0TG3AGA G3GGAATCTA C G GTGA GAT A TTATTCCTGA CAAT3G3TGC 960 

TCTGGOCAAG GACCACOT3G ACATAGTGOC CTTCGATAA 3 AAGTACAAGT CTGC^TTTAA 102 0 

C AAGC TCGC G AGCAGCATGG G 2AAG;; AGG A GCTGA<3GCAC CG3CCGG0GC AGATGCCCAC 10 80 

T2CCAAOGCC ATTGACTGCC G AAAATGTTT Ti 3G AGG AC C T CCAGAATGGT AGAGACCTTA 1140 

AGCTTCCC'IC TGGAGGGTAG G3TGGCGAAG T 1 3 AGG AAG AA GGGA'ITCTAG A3TTAAACTG 1200 

CTTCCCTCTT GCCTTCATGG AGTTOGGAAC A ZGC T'GGG AA GGATGOCCAG TCAAA<3GCTC l^ov) 

CAAGCGA3GA CAACAGGAAG AGGGATCCAC TGTTACCAAA AGTCGTGATT CCCCCATCAC 13 20 

CAACCTACCC AGTTTGTTCG TGCTGATGTT G GGGG AG ATC TGGG3GGAGT TG 3T AC AGCT 13 8 0 

CTGTTGTTCC CTTGTCCTAT A3CGGGAACT CGCCTCCAG3 GTACGCACAG ATCTGCATTG 1440 

C CCTGGT G AT TTT AG AAGTT TTT 1 3TTTT AA AAAACAACTG GAAA3ATG2A GA'.3CTACTGA 150(3 

CJCCTTT? JCCC TGAATG 5G AG GTA3G3ATGT CATTCTCCAG CAATAATG3T CCCTCTTCGC 1560' 

TGACGTTOOT GAAGGAGGCC AAG3CTCTCC ATGCCTTTCT ACCTAAGTGT TTGTATTTTA 162 0 

TTTTAAA7TA TTT ATTOTO 3 AGCCAC AGG C CCCTTGCTTA TGAG3TT:TT ATGGAGAGTG 1630 

a:;aaag *;aa gggaaatagg c;caccatggt ccggtggttt gtagttcgtt caaagtcagg 1740 

CACTGG3A<3C TAGAG:;AGTC TCAAGCTCCC GTTAGGAAGA ACTG3TG2CC CCTCCAGTCC 1?00 

TAATTTTTCT TGGCTGCCCC (3CCTTGGGGA ATGCCTCACC CACCGAGGTC CTGACCTGTG I860 
CAATAAG^AT TGTTCCCTOC GAAGTTTTGT TGGATGTAAA T AT AGT AA AA C-CTGCTTCTG 102 0 

T < 0 TTT* rr 7" AA A A : I AAAAAA A AAAAAAAA G T 1 ^ ^ ^ 

(2) INFORMATION FOR SEQ ID NO: 132:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 990 base pairs
TTGAG7CCT TACT ATT A.TG A77-.TTTTCCT TATTATTTCT ACCAATGCTT CTT AT ATT AA 120 

AAA.TGCA.TC 7TTC-CA.77GT TTGATCTTAA ACTTTTTGTG TCTTTATATA AGGTATGCTY 2-10 

T77TA7C2-CA 7 GAT A. TTT TT AATCACAATA GTTGAAAGAC AATCTY C AC C TTTTACTTGT 3 3) 

TATTTATAT GTAA7GTAA.T TTTTGATGCA TATTACGTCT TATTATTTAA CCAACCTATT 36 0 

7TATTTTA7C 7AT-C-CTATTT TTCA.GAAAGC CTT ATTTTC T TGTATTAATC AAATATTTTT 4 2) 

A77CA7TGTAT 7TTCCi r 2TAT T AGTT AG KAA TACGKTACYC Y AAAT AT AT A TTGTGG3TAT 4 3 3 

15 TTTCAGAATT GCAATATC-CC TC CTT AATTT ATTAGAGGCT AAC CT AAATT ATTACTTTTA 54 0 

C CA. CTT AGTT GAA_~A7TCTG CAACTTTAGA AC ATTT ATT 3 TTTTATGCAT TTTAATTCTA 60 0 

CTTGTA7TTT 7ACTAC7CCT AA.ACATTATT ATTGTTTTAG ACAAGC CAA A ATATATNTTG 66 0 

TT A 7 T AC CTT ATYCTC7ATT TCTTTCTGTA TTTTTATGCC ACTATGTATG CTCAATTTC C 72 0 

TTGTATGTG.A 7GAA.CT~AA7 TGAGT ACTTT TGTTTTTTAA TCTGTGCAGG TAGCCTGC^C 730 

ATTAAA7TTT 7ATTTT7GGT T7GCTGAAAA AATTGTGTTT ATTTCTATAT GC AT AC TT AT 84 0 

GCATA.TAGAA, TCICTAGCTCTG A.CATATTTTT AGTATTTATA AATGTAAAGT CATTWATT KG 900 

CC7T:TA7CA. TTT 7777 GA GAAATCAATT GTCAGCCCAA TAGTTTTTCA TTTTAAATTA 960 




(2) INFORMATION FOR SEQ ID NO: 133:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1720 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133:

G TCT GAT AAjG CaACTTTGGT TATTCCCCTA AAG TTT ACTT CAGCACTAAC ACTAGTGCTT 60 

C C GCTGGLAG T TTGC.AGTTTT C CAGC TTT AT ACAGGATTTT CCTTTGACTG GAAGAGTCAA 12 0 

50 GGA7ATAGAG ACTGA A.CAGT GACATTTATT GTACAACATC AAGGGGAATA (3GATACTCAT 180 

CAAACTGGGA 7TATTCTTAT CAAAACATGG TCTTCTTTGA ATAAGAAAAA TACATAGTTG 240 

GTTATTATGG ACTTAAAACT G7GTTAAATG GATATTCTGA TAAAATATTT GCTGCTCTGT 300 

AGAjCTGTOGA AAATCT GA.GA A.7ATTAGCTT TACTCATCTT GAGCTTTGAG iCATGTTCTCT 360 

GTACC-CCGAT GGTTTCATAT TAA.CTAAAAA AGCTGGGTAT TGTAAAATCT CATTTATAAA 42 0 

60 AA.C7 ' CA-GATG AGAA-GAAAAT TTTCTTTGAT GGTGAGACTG TTGTCTTAGT TCAGGAAATT 480 

(2) INFORMATION FOR SEQ ID NO: 134:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 705 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



54 0 

600 
66 0 
7 t"3 

7 80 
8. 10 
900 
960 



ATTTAATAAT CCTTTGTTAC C TGTG AATG A AGGAACTTTG TAATTCTGAT TTATCGTAAA 
ACATGA-3CCT TTCCAGAGTC AGCTTAGACA CT3TTGTCGC AAATAGCCAT GCTTT-3CCTT 
ATGCCAAGGA G3CCCAGAG3 GA3GGCGTAG TCTTCCTCTG TTGCTG T ACA T AT ATT GAJVA 
T'XTTHTTT TTTTATTTTG CATTTGTTAT CT AT AATG A 3 CTTTCTGAGC CCTGATATTA 
10 T'3TG AG AC AA ACAGG AGTT A TT 3ATGTTAT ACACTCCCTT CCATTCAGGA TTTTCTGCTT 
Q3ACGGAAAT ATGTTGACCT TAGAGAATTG TGI iATATTGT TGCAATTCTT GAATATATTA 
CCATGTGAAT AATAGAGACT GTGTTGCTCT CTAGTATAAG CTATATTTAT TTTTG A TTC A 

TTTOAATTAC TAGTTATAAC TG3AGAAATT TT GTT ACCTC TATCCTGGCT TGCCTGACTG 
GCTGTATAAT AGCAGCAGCC TCTTTTAGAG CATCTTAATG AAAACATGGA TGAAAGGAAT 102 0 

20 TAATGATGAT ATC TO 3 AG AC TC-CGTAGAAA AT'3GCTTTTU TTCCCAGCGT TAACATTTTC 1080 
TTCTCAATCA CATTTIAATG TTTGTGGAGA GTGGCAGATT CACACCAGAA ACACTAGGTG 1140 
TTCATATCCA TAGCATGGAT GCAGAATAAG CAGTTGGGAG AGAACXTTTCT TCCTACCTGG 12 00 

TACTCCTCCC ATT C ACCTC A GCCCAGCCCC AGACAGGCGT TAGCATTCAG TGTGGGCCCT 12 6 0 

CAa^CAGCCC TGAAGCCTG3 CTGGGTCATC AGATGGGGGC AGC C T GTG AC GGGCACCAGC 1320 
30 GGCCT3ATTC CAGGGAAGAG TTCCTGGAGG GTGTTGGC TG TTTTTGTTAG C T C AGTTTTT 13 HO 

TTCTG3GCTC CACCATTCCT AACTCCAGGT AGACAAGATA GATGTCACAC ACAACAATTT 1440 



1500 



TAAAGTATTT TG2TTAGTGC ATTTTGTTTA TGATTGCAGT GTTTGTTTCT TATTTAATAG 
'3CTTTTTACT TCATTCTATT AAATTTTAGT GTTTAGAAGA GGCG3GTACT GTCACTGTGT 1560 
AAAATATGTA ATATTTTATA TGTTATACCA T3TCATATAT ACTTGCAATA TCAGA 3CTTG 162 0 

40 CATTCAATAT ACAATGCAAT TGACTCTTTG CAGACCTGCA TTTTTCA3TG AACAATAAAA 
AGATTGTCTG GCACTCCAAA AAAAAAAAAA AAAAAAAAAA 



1680 

1720 
OTGGAATGAC AGACACTCCT AGAGGAGAAT GCTGTTOAAG G AAC AG AAC G TACTCTTGGA 180

TTAAATATAG CACCTTTTAT TAACCAGTTT CAGGTACCTA TAG GTGTATT TTTGGACCTA 24 0 

5 TCGTCATTGC CCTGTATACC TTTAA'GCAAG CCAGTGGAAC TGTTAAGACT AGATTTAATG 3 0 0 

ACTCCGTATT TGAACACCTC TAACAGAGAA GTAAAGGTAT ACGTTTGTNA AATCTGGGAA 3 60 

GACTTGACTG CTATTCCATT TTGGGTATCA TATGTACCTT GATGAAGANG ATTAGGTTGG 42 0 

GATACTTCAA GTGAAGCCTC C < 2ACT GGAAA CAAGCTGCAG TTGTTTTAGA TAATCCCATG 48 0 

CAGGTTGAAA TGGGAGAG^GA ACTTGTACTC AGCATTCAGC ATCACAAAAG CAATGT 2AGC 540 

15 ATGACAGTAA AGC AATGAAG A^3CAGTTTTC CAATGAAAAC TGTGTAAATA GAGCATCAAC 600 



AAGTACAAAA TTCTTGIXTTT AATTAGTGGG OGTATATAAA AATTCCTTGT AATGGTCAAA 

tattttttaa aattgacatt aataaagcat attttaaaag tttct 


(2) INFORMATION FOR SEQ ID NO: 135:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 323 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135:

AGCACACACC TCCTTTAGTT GCTCCTAAGG TCATGTTCAA CATTCGTGGA GTGCATTTTC 60 

TGCTCA'^G-GA CCTTTCCCAG ACCSGGAATG TTTGGTGCTC ACAGACYCTG GCAAGGATCG 12 0 

GTATTGCTGT TCCTCA3TTT TGOCTGGGGA AATGGAGGST CAGTGACGTT CAGTGACGTG 180 

40 CCCAGAGTCA TGOGATTGGC GGGTGGCCCA GKGMTCCAGG TCTCCAGCAC CCCTCGGCCC 2 40 

CCTCCTCACC A'GGTCACATC ATCTCCTGGA TTAGAATCTG CTCACATAGT CTGTCCTGAA 3 00 

A'GGAAAAAAA AAAAAAAAAA AAC 3 23 


(2) INFORMATION FOR SEQ ID NO: 136: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 582 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136:

GGACGGAATG GT3CAACCCT CCTWAKTTTT CTKGK.GCTGT TGACAACAGA GGGAGGGAGG 60 

60 
G AAAAC ATTT TTYGTCX5GAG AATCCTACYT GTGGAGSGGA G T C CTT AA 1 GC GATEGATTTT 12 0 

gaatctkga: CCTTTACGAA CTAMTTTGA AGGAAGATAC CTTGGAAATA TTTGGCATTC IRQ 

agtggctta: t^aaacagca ttagtgaatt catctagaga actctttoat TTATTCAGGC 240 

AACAAGTGTA CAACTTGGAA ACCTT-3TTA- AGTCCAGTTG TGATTTTGGG AARGTATCAA 3 00 

GTCTACAGTG CAAAGCAGAC AATATTAG3G AGIAGTGTGT ACTATTT ITC CATTATGTTA 3 60 

AAGTTTTGA T CTTCA3GTAT CTGAAAGTA Z AGAATGGTGA GAGTGAT3TT CCTGTCCATC 42 0 

CTTATGAGJC TTTGG AG OCT CAGCTTCCCT CA ^TGTTGAT TGATCAG2TT G ATO 3 ATT AC 4 80 

15 TCTTGTAT-.T TGGACACCTA TGTCAACTTC CCAGTGTTAA TATAGGAGCA TTTGTAAATC 54 0 

AAAACCA GAT TAA3GTTTGA CTGGTTTCAT TT 3ATTTTTA AG 5 32 




(2) INFORMATION FOR SEQ ID NO: 137:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1021 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:

TTCGG 2 AG AG CCCTTGCGCG C TCTTGAAT A CCTGCKTTITT GT AGO GCT AG TTCTCTTCAA 6 0 

GATTTGCTTA GTCTGATTTC ATTTCGGTTT CTTTTCTCG: CATGTTTTTC TGTCGGAATT 120 

acq jTtcgtt ttg3ttctat gtagtctcta aaatgttatc ( 3TTTTTC A r rr tgtctactaa 18 0 

TTTTCGTOGA TTTGTTACTA CTG AGTTTOT TAATATCTGA GTGGCCTGCG CCCACG7GCT 2 40 

40 CTGOAGANCA TAAAATACTC AGGCTGATGG TAGTGCAGA3 ACTCTCCCTC CTTGATCAGC 300 

'OCAAACGTTG OT7TGAGGGT TGAOG3 AT ZG AG^XACATTT TCTTGG7TGT GTGAAGCGG3 360 

r-??-^\ TV c ^OGAGAGGIG GCX;CACA30 GO :AG0CTCC ACCTATTGTG AGTTCAGAAG 420 

ATCCTOOGIG GT^G-ICTCTT C2TTTGTATC CAGTAGTAGG AG AG 1 1 A. CT C A CTGJACAGCT 480 

GTGATTTGGG ACTGGTTTCG AG3CCTTGGT G3CGG2TGCC C3GAGT-2TAC TGG2AAAA2G 54 0 

50 GACTCT Z'TC Z TG3AGTCCAG AG:ACCTTGG AACCAAGTAC ACCGAA 3CCC ACT3AGTT3A 600 

GTTC^GGGGGC gagagagaag CACCAAGAEG cacccctaga akargtgggg gagggagaro 650 



6( ) :g -g— :g v: • - G'c :r g- 

AGCCTTAGAT AGCAGCAGAA OGCTTTTTGG ATTCTCCTCC TTO AAAAG AT TCTCAGTTAC 960 
CAAACGTCTC CACCTAGAAA ATAAAAATAC ATTAAGATGT TGAIIAAAAAA AAANAAAAAA 102 0 

A 1021 



(2) INFORMATION FOR SEQ ID NO: 138:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1777 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138:
CGGAAGATGA TaOTCMO AGATCCATTC ATGAAGTGAT ACTAAAAAAT ATTACTTGGT GO 

ATTCAGAACG AGTTTTAACT GAAATCTCCT TGGGGAGTCT CCT'GATCCTG GTGGTAATAA 12 0 

25 GAACCATTCA AT ACAA 1 2ATG ACTAGGACAC GAGACAAGTA CCTTCACACA AATTGTTTGG 18 0 

CAGCTTTAGC AAATATGTCG GCACAGTTTC GTTCTCTCCA TCAGTATGCT GCCCAGAGGA 2 40 

TCATCAGTTT ATTTTCTTTG CTGTCTAAAA AACACAACAA AGTTCT'SGAA CAACXTCACAC 300 

AGTCCTTGAG AGGTTC-3CT3 AGTTC T AATG ATGTTC CTCT ACC AG ATT AT GCACAAGACC 3 60 

TAAATGTCAT TGAAGAAGTG ATTCGAATGA TGTTAGAGAT CATCAACTCC TG02TGACAA 42 0 

35 ATTCCCTTCA CCACAACOCA AACTTGGTAT ACGCCCT07T TT AC AAAC GC GATCTCTTTG 4 £ 0 

AACAATTTCG AACTCATCCT TCATTTfTAGG ATATAATGCA AAATATTGAT CTGGTGATCT 54 0 

CCTTCTTTAG CTCAAGGTTG CTGCAA2CTG GGAGCTGA3C TGTCAGT3GA ACGGGTCCTG 6 00 

GAAATCATTA AGCAAGGCGT CGTTGCGCTG CCCAAA'GACA GACTGAAGAA ATTTCCAGAA 6 CO 

TTGAAATTCA AAT ATGT 3G A AGAGG A'GC AG CCCGAGGAGT TTTTTATCCC CT AT 3TCTGG 72 0 

45 TCTCTTGTCT ACAACTCA3C AGTCGGCCTG TACTGGAATC CACAGGACAT CCAGCTGTTC 7 80 
ACCATGGATT CCGACTGAGG GCAGGATGCT CTCCCACCCG GACCCCTCCA GCCAAGCAGC 84 0 

OTTTCAAGTT CTTTTATTTC TGQ3T AAC AG AAGTAG AC AG ACAOGTTACT T3GTGTATCT 900 

TCTGTTAAAG AGGATTGCAC GAGTGTGTTT TCCTCACACA CTTTGATTTG GAGAATTGGT 960 

G2TAGTTGGC AATAGATAAC TCAGC GTAG A TAG T ATTGCA AAAAGGGGAG GAAATACACA 1020 

55 ACAATAATAA ATGTAAAAAC CTGCTATTCA AC ATGC AGTT TTATTTCGAR GCCAAAAATC 1080 
TAGAGCTTTC CCAAGATCCT GTT'jCCTTAG GCACATNCAC ACTTCAACAG TGCACACTAT ~ 114 0" 

CCAACAGTGC ACACTATTCA ACAGT'GCAC A CTATTCAAAA GCGTAGACTA TTTTTTTGCA 1200 

60 



(2) INFORMATION FOR SEQ ID NO: 139:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 643 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139:

TTT™rTTT r T TT r ' 1 Tr r rTTTT TTTTTITTTT TTTTTTTGG* 3 AAT< jAGAAAA TAACTTTATT 



CG3CAGCCTT GGT-1ACCTTG AGC ACGTT ■ "A AG2G2ACTGT CTTGCTCAGA GCCCGGCACT 
40 CGCCCACTGT GAC3ATGTCA CCGATCTGGA CGTCCCTGAA GCAGGGGGAC AGGTGTACAG 
A2ATGTTCTT GT0GCGCTTO TCCAAGCGGT TGTACTTGCG GATGTAGTOC AGATAGTCTC 



cctcgaatgg ACACATTAcr agtiaagggg catttcttct caatgtaggt gcccctcaat 

AGCCTCCTTG GCGTGTCTTT GAAGCCCAGA CCGATGTTCT TGTTAGTAAC CCGCGGGAOC 
50 TTCTCCTTGC C AGTTTCTC Z CAGCAGGACC CTCTT2TTGT TTTGAAAGAT GGTCGGCTGC 



i n 6 0 



1320 
13 8 0 



TGTTCAAGAT ATTTGTTTTG GT2TTATGTG TGTGTGAGAG AG AG AG ATT C CTTTGACATT 
AAGGAGGATC AATGAGAAAA GATGATGAGG CAGGAATTAA TAAAGAAATG AAGTCGTGTG 

tgtttg::ttg CCTGTCAGAG GGCACACAAT ttcataaaca ccatgcctgg acaatttgat 

ATTAATATTT AACACCTCTG CATCTTTTTC TTAAAAAAGA ATATGGGCCA GATACAGTGG 14 4 0 

CTCACATTTG TAATCCCAGC ACTTTGGGGA GCCAAGTTAG CAGAATCCCT TGAGCACAGG 1500 

AATCTGAAAC CAGCTTGGGC AACATAGTGA GATCCCATCT OTACAAAAAA CTTAAAAATT 1560 

AGCCAGGCAT GATGGCACAT TCCTGTAGTC CTAGCTACTC AGG AO GC T AA G3TAGGAGGA 162 0 

15 TTGCCT'3AGC CCAGGAGTTC AACGCTG2AG TGAGGTAAGN ACGTGCCAGT AOACTCCAG:: 168 0 

C T3 AGO 2AC A A^GT 7' AG AC C CT7JTCTCGCA AAAAJVAAAAI J TTAAAAAGTC GGGGGGGQ0C 17 4 0 

CC3GTA2CCA AATCGCCGGA TATGATCGTA AACAATC I 777 


TTC ATTGTGG GGAGGGGGCC GATCTCCAGC CTCAGAACTT CTO^AACTGC TTCTTGGTGC 12 0 



ISO 
240 

300 



GGCGGAT^AC AATGGT^CTC TOCAT2TTCA TCT1GGGTCA CCACGCCAGA GAGGATCOGC 3 60 



42 0 
4S0 
540 



TTTTGGTAGG CACGCTCAGT CTCAATGTC 



"GMAY TCCTGCAGCC 60 0 



60 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1220 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:

GGCACGAOSA TGATAGACCT ACT<3GAGGAA TACATGGTTT ACAGX3AAGCA TACCTACATR 6 0 

AQ3CTTGATG GCTCATCCAA GATCTCGGAG AGGCGAGACA TGGTTGCTGA TTTTCAGAAC 12 0 

AG3AATGACA TCTTTGTGTT CCTGTTAA3C ACACGAGCTG GAG3ACTGGG TATCAATCTC" 1 '0 

15 ACTGCTSMAG ACACAGTGCA TTTTC T ATG A TAGSGACTG^ AACCCCACTG TG3ACCAGCA 240 

GGCCATG3AC AGG3CC3ACC • 3CTT AGGG 2 A < JACAAA3CA 1 !i GTTACTGTGT AGCGGCTCAT 3 00 

C T* 3T AAAGG "~ ACCATTGAAG AACGCATTCT t^CAAAGAG: : AAG3AGAAGA GTGAGATTCA 3 60 

GC<3G AT 3GTG ATTTCA- OGTG GGAACTTCAA AC SAG AT ASS TTGAAACCCA AA3AG jTGGT 4,0 

TAGTGTTCTT CTAGAC3ACG AAGACTTGSA 'JAA3AVv3 2T ATGTACTCTA AASCTCTATA 4 r '0 

25 CACTCCCCTC ACG T ATCTG A GAATGGAAGA G3TACTTGG 3 TGTGTGCCAA C-GGTTAGGCA 54 0 

AAGOSA3AG3 CTGTATTTAG -3GAAAGTATT TTTGTGCTCA TATTTTATAT AAAAACCCAA 60 0 

ACAAGAATGT CTTTGTAG3C CAGGC GTGG T G3CTCGCGCC TCTAGTCTCA GCATTTCGGG 660 

AF-GCCAAAGT G3GCAGATCA CCTGARGTCA G3ARTTTGA3 TTTGARACCA iXTCTQ3CCMA 72 0 

C GTTGT GAAA CCCCACCTCT ACTARGAF TA • 2 S G AAAATTG GTTG3GCATG GTGGC3GGCA 780 

35 CCTGTAATTC CAGCACTTTG GGAGGCTGGG GCAGAANAAT TGCTTGAGCC 1 SAGS A3GTG ^ 84 0 

AGATTGGGGT GAGCCGAGAT YGTGCCATTG CAMTCCAGCC SGGOSAATAA GAGTGAAAYT 900 

CCATCTTTTA AAAACAAACA AAAACAAAAA AC A< 2AAGACG GCTCACAC3T GTAATCCCAG 960 

CACTTTG3GA F.GCCGARGCA G3T03ATCAC GAFGTCAGGA GTTCCAAGAC TAGCCTOGCC 102 0 

AACCTGGTGA AGCCCCGTCT CTACTAAAAA TACMAATATT AGTCGGGCGT GGTGGTGGGC 1030 

45 A CGTGT AATC OGAGCTACTC G3GAG3CTGA 1 3GC AGGAGAA TCCCTTGAAG CTAGGAGGCA 114 0 

GAGGTTGCAG TGAGCCAGGA TCGTGCCATT '3CACTCCAGC CTGSACAACA AGAGGAAGAT 1200 

TCCATCTCAA AAAAAAAAAA 122 0 




(2) INFORMATION FOR SEQ ID NO: 141:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 721 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:
AATTCGGCAC GAGCCAGGTT AGCCGGAAGG GCAG2TCTOC AGGCCCTGCC CACGCCACAG 

GGGGCTCCTT ATGOACA<0OG GG3CGTCTCC TTGTGGCCAT AO AAACGG AA. CT^>CTCTTT 
TCAACAGTGC TGCAAGA3GA TG3ITATTTA ACGCTGGCCC CCAAGGAGGA AAGGCACAGA 
10 CYTTCCTCCC TCCTGGAACA TCCAAGGGCA CTGGATCCTC TCTGTCCCTC TGA3ATGGGG 
TGCCACTCGA GCAAGACiCAC CAO0GTGGCA GCTGAGTCCC AGAAGCTTGA AGAAGAGYGC 
GAGGGAAGAG ACK2CAGGTCT GGA^CCGGC ACCCAGGCAG CAGACTGCAA GGATGCCCCG 




A 



60 
120 

ieo 

240 
300 
3 60 



CTGAAGGATG GAACCCCTGA GCCAAAGAGC TGAAATGCCT CTCTCCAGAG TCGGACCCTG 420 



430 



ACCTCYTTCC TGGAACTGOC TTPSGCCCCA GAACCATGAG ACAATCCCCA CCCTGAGAAG 
20 CTCCGATCAC TGGGAGGAGA GAGAAAGCCT CCAGCTTTGG GATTCAGGCT TCAGAAGT'IT 
TTAGCAGCCT TTGCTCATTG GAGAGGT03G GAAAGGATAA AOTTCTTATA AGGAAATCCC 
TAATTTCCCC CAGCTC 7TGC CCNCCNGAAG AAGGAACNAA AGAAAGTTCC TTCCACACGT 
TTTGTTGGAA ACTTTTCC 2T TGCCAACTTT CCTTGGATTG CCAGAACAAA GCCCTCCAGA 720 

721 



t>4U 

600 
660 




(2) INFORMATION FOR SEQ ID NO: 142:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1468 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:
ATCAATTAAT GTTTATAAAT GACTGTACTG AATTTAAAAC CGTACAGTTT CATITGCATT 6 0 

45 ttgacattac tttattatac attttgcatt taaaaogctg caccagttgg c-ttttottot 1..0 
gttttattot caaaatatag agattctgtg at tt atttgc octgt'itaoo xvctaaaaav, :n0 

AAAATT^AA TATAAA&7AT TTCAATAGGA TGCATAGGTA TATTACGTTT TTTAAATGCT 24 0 

TTAGATCTGT GATTCTTGAC TTACTATTTA TTTTATCC 0C TTTAAGTCAG G3ATO_TTTA 
TTCTATTTrA AAGOACTrAT GAG™ACATO TTGT A ATO AA GTTTGCACAA TATATTTATC 



300 
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GAAAAAAAAA TTCTCATTAT TTGCAAAGAA TGAACAAGTT AATGAACAAA C AAAC TAG AT 60 0 

TTGGTATGTT TTCAGCTTTT GTATCATGTT TAATTGTTTA AT-rTOGTT\.;A AAAACTGCAG 66 0 

TTGAGAAATC AGATA< 3CAAT ATAGACATTC ACAGCAGCTC TGTGGATACC ATGTAATTGT 72 0 

CAGGTAATTT CAGAATGTTG AAAATTATTC AGTGC AGC C 1 0 TCATAGTATC ATACTTGAAG 780 

AAATTGATTA CAGTTGCACT AAATTGTTGA AGATAAATTA TT1TTAAAGG TTATGAAAAC 84 ( 

T AAGTT AT AT T AATTC AT AT GTTTGATTTT TAAATCCCAG CTCCTCAA'GC TATCGAATTT 90 0 

NCTGACTTTG AAAATAACCA TGAGAGATGC CACATTTCTC TCTGGGAAAC TACCACTCAA 96 0 

15 AGAATAATTG TTAAAAATTA AGCTTTTAGG T ATT AG AAG 3 TGTTATAAAG TATAAAATTA 102 0 

AGATATAAGC AGATCACATG TAAATCATTC CTAAAGCACA AGAAAAOtAAT GTGCCTTGAT 108 0 

GT AC AT AT AT TACTAAGTTG CCTCTCCCAG TTTACTTTAA AAATGG 3TTT AAGGATAAAG 114 0 

20 

AATAAATGT3 ATAGCTGTGC ATGCATTATA TATTTGCATT TGCAJVATTTC OCATTGTTTT 120 0 

AAC AG" TG T O TOGCTGACTT TCAATTTTAA GACGTGAATT GAC AT ACAGC CG AT AACTTT 126 0 

25 ATAATGQ 2 TG CTCATTTATC TTATCTTTCA GTTAGTGGAA AAACATTTCA ACSTGACTAA 132 0 

AATTTGGAAT TGTGTCTTTT ATGTTCCATC CTCTGTTGTT ACTAGATTTA GTTTAAAAAT 1330 

TGTGTATGA2 CATTAATGTA TGTCATAAAC ATGTAAATAA AAGATGTTGA ATCTTGTTGA 144 0 

AAAGCAWRAA AAAAAAAAAA AAACTCGA 14 6 8 






(2) INFORMATION FOR SEQ ID NO: 143:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 300 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143:
TGAATTTTTT GCCAAACTTA OTAACTCTGT TAAATATTTG GAGGATTTAA AGAACATCCC 60

AGTTTGAATT CATTTCAAAC TTTTTAAATT TTTTTGTACT ATGTTTGGTT TTATTTTCCT 120

TCTGTTAATC TTTTGTATTC RCTTATGCTC TCGTACATTG AGTACTTTTA TTCCAAAACT 180

AGTGGGTTTT CTCTACTGGA AATTTTCAAT AAACCTGTCA TTATTGCTTA CTTTGATTAA 240

AAAAAAAAAA AAAAAAAAAA AAACCCCNAG GGGGGGGCCG GGTNCCCAAT CCCCCCCAAA 300



(2) INFORMATION FOR SEQ ID NO: 144:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2243 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144:

T3CCTCCCTT CCTCCAGATT GTGGACAGTA GTTCCTCAG2 CTGCACCCTG GATTC C TTCT 60 

TCCCCTT2CT AGO! CCATOG GACTCGGCCC AAGA2TGTGC CTTCAA3GAC CACCA3CCCC 120 

TTACTCTTCA AGCCCTGACT GTG3AGTTGG TAGATGSCT7 TGATCCTCAG TATT2T7TCT 13 0 

15 G3CAATGTTC CAC7-GCTTCT ■ S CTTC C TGGG AGCT3G7TCC ATAACTTGAT TTT7CC7AAA 24 0 

CGTGTTGCAA TC77TG2TG2 C2CTTAGCCA CCCAGGGTCT T3TGT3G3TA TGAGT STAGA - 300 

G3ATG3GGGT ATG7CA3G2C TGGGCCGTCC CAGGCAGGCC CGC'IGGACCC TG ATG7 T ACT 36m 

C 2TATCCACT GC C ATG' FA 2G GT3CCCATGC CCCATTGCT 7 GCACTGT02C ATGTG3ACGG 4 2(- 

CCGAGTGCCC TTYCGGCCCT CCTCAGCCGT ' 7/3 TGCTG AC ' F GAGCTGACCA A<3CT A 2 T 3TT 4 3U 

25 ATGCOSCTTC TCCCTTCTGG TAGGCTGGCA AGCATGGCCC CAG3GGCC2C CACCCTGGCG 54 0 

CCAG3CTGCT CCCTTCGCAC TATCAGCCCT G2TCTATGG2 3CTAACAACA ACCTGGTGAT 600 

CTATCTTCAG CGTTACAT-7G ACCCCAGCAC CTACCAGGTG CTGAGTAATC TCAAGATTGG 66 0 

30 

AAG2ACAGCT GTGCTCTACT G7CTCTGCCT CCGGCACCC-: CTCTCTGTGC GTCASGGGTT 720 

AGCGCTGCTG CT7C~~AT0G CT-3CGGGAGC CTGCTATGCA GCAG3GGGCC TTCAAGTTCC 7 30 

35 CGGGAACACC CTTYCCAGTC CCCCTCCAGC AGCTGCTGCC AG2CCCAT3C C C ■ 2 TGC AT AT 840 

CACTCCGCTA CX>7CT3CTGC TCCTCATTCT GTACTGCC1C ATCTCAGG7T TGTC3TOAGT 900 

GTAC AC AG AG CTG~*TCATGA AGCGACAGNG GCTGCCCCTG GCACTTCAGA AC2TCTTCCT 960 

40 

CTAC ACTTTT 'SGTGTGCTTC TGAATCTAGG TCTGCATGCT GGCGGCGG2T CTGG7CCA3G 1020 

SCTC2TXAAA GGTTT7TCAG GATGGGCAGC ACT7GTGGT3 CTGAGCCAGG CACTAAAT 3G 1080 

45 ACTOCTGATG TCTGC7GTCA TGAAGC ATO 3 CAG^GSAIC ACACCACCTCT ~I^TGG'?GT7 1140 

CTGCTCGCTG GTGGTCAACG CCGTG2TCTC AGCASTCC'I G CTA2GGCTCC AG:TCACAGC 12L-0 

CGCCTTCTTC CTGG7CACAT TG7TCATTG3 CCTG3CCATG CGCCTGTACT ATGG2AGCC3 1J6 0 

CTAGTCCCTG ACAA7TTCCA CCCT3ATTCC GGACCCTGTA GATTGGGCGC CACOYACCAGA 13 2 0 

TCCCCCTCCC AG3C7TTCCT CCCTCTCCCA TCA 0CAO2CC TGTAACAAGT CAS C TTGTG AG UB0 



50 



60 
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CCCATCCTTG TOI^GGCAGCT CCCTGCTTTG TC C TGC AT G A AC AG AG TTG A TGAAAGTGGO 162C 

GTGTGGGCA-\ CAAGTGGCTT TCCTTGCCTA CTTTAGTCAC CCAGCAGAGC CACTGGAGCT 163C 

<3GCTAGTCCA (3COCAGCCAT (3GTGCATGAC TCTTCCATAA G^SATCCTCA CCCTTCCACT 1740 

TTCATGCAAG AAG3CCCAGT TG2CACAGAT TATACAACCA TTACCCAAAC CACTCTGA2A 13C0 

i'.iT':TCCT:CA GTTCCAGCAA TGl'CTAGAGA CATGCTCCCT GCCCTCTC LA. CAGTGCTGCT 1360 

CCCCACAGCT AG 2 2TTTGTT CT33AAACCC CAGAGA03GC T333CTTGAC TCATCTCA3G 192 0 

1 3 AATGT A 3C • Z CCT3Q3CCCT GG2TTAAG2C GACACTCCTG ACGTCTCTGT TCACCGTGAG 19S0 

15 GGGTGTCTTG AA3CCCGCTA CCCACTCTGA '3GCTCCTA3G AG3TACCATG CTTCCCACTC 2 040 

TG3GGCCTGC CCCTOCCTAG CAGTCTCC 3A '3CTCCCAACA G2CTOGGGAA G2TCTG2ACA 2100 

GAGTCJACCTG AGACCAGGTA CAG3AAACCT GTAGCT2AAT CAGTGTCTCT WTAACT3CAT 2160 

20 

AAGCAATAAG ATCTTAATAA AGTCTTCTAG -3CTGTA'3G3T G<3TTCCTACA ACCACAGCCA 222 0 

AAAAAAAAAA AAAAAAACTC GAG 2 24> 

(2) INFORMATION FOR SEQ ID NO: 145:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1082 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145:

GCCAAGCTCT AATACGACTC A:2TATAGGGA AAGCTGGTAG GCCTGCAGKT ACCGGTTCCG 60 

40 GGAATTCCCG GGTCGACCCA C3CGTCCGCT TC CGTGTGTC AAAATCCTCA CCTCCTTCAT 12 0 

AACCATCTCC CACAATTAAT TCTTGACTAT ATAAATTTAT GGTTTGATAA TATTATCAAT 130 

TTGTAATCAA TTGAGATTTC T TT AGTGCTT (3CTTTTCTGT GAGTCAA 3TC CCCAGACACC 240 

45 

TCAT P 3T AC T TGAAAACTGG AACANCTTGG GAATGC CATG GG3TTTGATA ATCTGCCAGG 300 

GACATGAAGA GGCTC AGCTT CCTGGGACCA TGACTTTQ3C TCAGCTGATC CTGMACATGG 3 60 

50 GAGAACAACC ACATTTTTCT TTGTGTGT3C TTCTAGCA3C TGTTCGGGAG '3ACCKTGACC 420 

CAAYAGTGTT CCCATGCTGT TTCTTGTGAA ATGCTCTCGG CTATGTAGCA GC TTTT<3ATT 430 

CCCT3CATAC CCTA3GCTGC T 1 3CCCCTATC CTGTCCCTT3 TTTATAACAT TGAGAGGTTT 540 

55 

TCTA3GGCAC AT ACTG AG TG AGAGCAGTGT TGAGAAGTCG GGGAAAATG3 TGACTACTTT 600 

TAGAGCAAGG CTGGGCATC A GCACCTGTCC AGCTCTACTT GTGTGATGTT TC AGGAACTC 650 

60 A'3CCCCTTTT TCTGCCTAGG ATAAGGAGCT GAAAGATTAA CTTGGATCTY CTAATGGTCC 72 0 
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AAATCTTTTG GTCACAATAA AGAGTCTCCA AATTAGAGAC TGCATGTTAG TTOT^ATGG 
ATTrGGT^ CTGACATGAT ACCCTGCCAG CTGTGAGGGG ACC 3CG1TTT TAAGATOOAT 

5 

CGCCAAGGTC TCTGCAAATG GA^ATGCTTA CACTCGGTGT TGG^GATOTT TC^ACCTCC 900 
TGCTATTTIT GTGGTTTTG3 TTCTCCCACT ATGGTA3GAC CO 3TGGCCAG CACCGTGG3T 
10 TGTCAT3TCA GOCCCATTGA CTACCTTTTC AT3CTCTGAG GTAGTACTGO CT* 3TOC AG2A 
CAAATTTCTA TTTCTGTC AA TAAAAOGAGA TG AAA AT AAA AAANAAAAAA AAAAAACT 0G 
NG 




(2) INFORMATION FOR SEQ ID NO: 146:



ACAGT3TACG 3:3ATG AACTG GAGTGTC-CGG CCOGATAAGC G0TTTC3CT1 



0 



960 
1020 
1080 
10H2 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4313 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146:

CAAGCTGGGT TGAAACTAC5G GGT C GGO 3 TG GGCCGTCGTC GTTGTTT 3TG GCCGCATCCC 60 

cgcttccggg ttaggccgtt cctgoccgcc ccctcctct: CTCCCTTCGG AGGGATAGAT 12 0 

CTCAG3CTCG OCTCCCCGCC COOCC-CAGCO CACTGTTGAC CCG3CCCGCA GTGCGGOGCG 130 
35 GTGGCCACCA TGTCCCTG3A CGO2AAACQ0 AAG3AGATCT ACAAGTATGA A(3G GC CCTGG 



40 



GCTG3GC 3 00 



AGCTTCGTGG A : GG AG T AC ' AA 3AACAAGGTT CAGCTTGTTG GTTTAGATGA G3AGAGTTCA .'"-6 0 

GAGTTTATTT CjCAGAAAC AC CTTTGACCAC CGATAC3CCA CCACAAAGCT "ATGTGGATC 42 0 

G CTG AC AC AA AAGGCGTCTA TCCAGACCTA CTGGCAACAA GCGGTGACTA TCTCC G TGTG 4 3 0 

45 TOG A GOG TT V 3 oTGAGGGAGA GAGOACOCTG GAGTGTTTGC TAAACAATAA TAAGAACTGT '40 

GATTTCIGTG CTG 2CCTGAO CTCOTTTGAC TGGAATGAGG T.^Gm. . C^.l a A r\.- - - - - Ai.A_- ± -■ -< ^ 

AC C TCAAGC A TTGATACGAC ATOCAO0AT3 TGOGGGCTGG AGACAGGGCA GGTGTTAGGG 660 
CGAGTGAATC TCGTGTCTGG CCACGT3AAG ACCCAGCTGA TCGCCCATGA CAAAG AGGTC 
TATGATATTG C ATTT AGO CG GGCCGOGGGT GGCAGGGACA TGTTTGCCTC TGTGGGTGCT 



2 0 



60 
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CTGT5GCCAG GTTAAACAAC CATCGAGCAT GTGTCAAT'iG CATTGCTTGG GC2CCACATT 102 0 

CATCCTG2CA CATCTGCACT GCA(3CGGATG ACCACCAGGC TCTCATCTGG GACAT02AGC 1030 

AAATGCCCCG AGCCATTGAG GACCCTATCC TGGCCTACAC AG2TGNAA2G WCA2ATCAAC 1140 

AATGTGCAGT GG02AT2AAC TCAGCCCGAA YT Z-TCG2CAT PTGCTACAAC AACTGCCTGG 1200 

AG AT ACT 2AG AGTGTAGTGT TGGTZGCGCT CT-3CCCACGA GGCAGGGGCT TTTGTATTTC 1260 

CTJCCTCTGC CCCACCCCCA AAGTAA3AAG AAACATG 2TT CCAGTGGCCA GTATCTCTTT 1320 

CATTGCTTTG CACCCAGTGT TAZCAGAAGC TGCTCTA3GA GTTCCTGG7C AGTCACCCCA 1380 

15 TCGCCCTCTG T< G G2 AG AC T C AGTG2TGTGT GGCGCZTC2T CAGCCCAGCG CTGA2TTTTA 1440 

AGATTTTCTC TCGTTTCCTC TTCTCCTTTG GTTCCTCAAT TAAAAAATGT GTGTATATTT 1500 

GTTTGTCAGG CGTTGTGTTG AGGAGCAGTT CACGCA2T CTGTGTCTAT TCCTCTGCCC 1560 

20 

AGGTGTCTCT GTTT< "XTTGC C CAAKGYWKKT TTTCATGT ZT CGTCCATGTC CATGTTCGTG 1*520 

TTAGCACTV.'A CGTG3GAACA AATACCAATT TGTCTTTT:T CCTAGTATCA C-TGTGTTTAA 1^3 0 

25 CAAATTTTAA CTTTGTATAT TTGTTATCTA TCAGGCTAAT TTTTTTATGA AAAGAA.TTTT 1/4 0 

ACTCTCCTCX; TTCATTTCTT TGTCTTATAG TCCTCCCTCT TTGCACCTTO TTCTGTTCCC 1-300 

TCAGTGCCTG GA3CTGGTAC T<3GGCCCCTG GCCCCATGAG 7AGTTTCC CT TCTTGAGTCA 1360 

30 

CTGCCTGTGT ACT AC AT AC C TGACCGGGAG TC C AAA. C C AC CTTGGTZXZTO TGAAVTCCAC 192 0 

TGACTCATCA CACCTTTCTT AGCCTGGCTC CTCTCAAG3G a.TTCTG;X"^ TTOTAAACAG 198 0 

35 A 1 2 AT AGG AAG C CTCTGTTT A CCCTGAAGCA CCACTGTCCA GCCCATTOGr TCCCACTGGC 204 0 

AC 'iC ATGGT AG ACJCTGAGAGA AACAGZCTCT CAGGCTACCT GACTTGACSGG GAATCGTTTC 2100 

ATGAAGCTGA ACTTCAAGCA TATTTCCAGT ACATTCTTTC AGAGTCTGTT TTT C C ATC C A 2160 

40 

AAT AT AAGC C CCA024CCATT CCACTTAGTG TCTTTTCAAT GATAGGCAAG AATCIATATCT 222 0 

GAGTTGAACT TCGGTGCTTC TGTTGTTTGA GTTTACTGTG CCTGGTGGTA TATT^GGGCAT 2 280 

45 TCTTTGGATT GAGTGTTCTG AGGTGAGAGA GTCTTCCCGA GGCATCCTGT CTGTGCTTCC 2 34 0 

AAC C CTGAAC AAG AC CTT AC ATGAGAGATG (3ACTGATGGA CTGCGGCAAT CCTGGGCTGT 2 400 

CAAGTGGATA GATAGTTAAA AAGCATTATA CTGTGGGTAA TGAAAAGGGA G3AAAAAAAA 2 460 

50 

AGAAGGAAAA GGAATTATAG ACCCCCAGGG TCAGCCAGTT AAG AGCT 1 IT A CCCACACCTG 2 52 0 

TCAACCCCTC TCTCCCCCAG TTT AGGTT 2 T GAGCAGTATT ' 3G ACTTGT AG C 2TQ2AGTTG 2 58 0 

55 T'TTTTTGACT TGCAGGCCGC AGTGTCTTTC TGTTATGTGA ATGAGTTZCA T03AGGGGCA 2 640 
TATGTGTGAT TCCACCGTTA GAT3AGCCCT T3GGG3A3GC AGTTT03GAT GT< G 2TCTTGG " 27 00- 

G3GAAAGTTG GCTGTTTCCT TGCGCTCTGC TCCTACCCGA AGTTTTTAAG TCCCTCTGAA 2760 






TTGCTCATCT GAG ATT ACTA GAGTA3CAGG v.Ci^ Jrt . ^ i>AJ -"° 1L - i ^ 11 ^ " 



2880 
2940 
3000 
3060 
312 0 
3180 
324 0 
33CO 
3 3e0 
3 42 0 
34*0 
3 54 0 



TTCTCACCTG C TTGAG AAGT AAAAC AGT AA CTTTGTTCTT CTGGGCCCTT AAG:TTTTTT 
GGTTAAGT 2T TCCTTTT 3AG AAGT AG AT GT CATT AT ATGC CAAAAGTCTA GCT 2TTTGCT 
TT AG CAT A 2 A OGGACCT3TC CCAAA 3AAAA AGGCTCTTTT TTT AGC CAGC ATATTTCCCC 
TTCTACCCTT TT ACTTTG TT GTTCTGATTT TAGGACTCTG GCTGGCCATG TGCTTGTO3T 
TGCCTCTCCT GCATTTGC 3 A CTGGATTTGC ACTGCATCGT TTGG AG AT AC AAACXTGA3CA 
GTTCTTGGTC AGAACCCT3C TCTGCTTTTC ATT3T3TTTG AT AATCCTT A CTOXTTCCTT 
15 C TCTCAAGGG TA3CAAOG2C AAGCTGATG3 CTGCTTGTTT AGGAG03CAT CAC;TTCCTTC 

ctgtggagaa gggtctcaaa tggaagtcag tggtagaagg gcctggtctg CTOOGCAGGG 

CTTACATCCA CTGAGTTCTA AGATT02TTT CCTGATCTGC ACCTACC JCT GGTCTGTATG 
GTGGAATTTG TCA3CT;GAA CTCAGAAACA ACAACTTGAA aaaaaaataa taattagaac 
at atttgc at aagatagcta TTTACTCTGG aaaccaacaa cttttgaoat ttcccttgcc 

25 CTGTGGACGC CCAGCTCCTG TCATCCTTC' 2 TTAGGTCCTG CAGTACA 3TC TTCCCCTGAA 

TGCCACC 303 GACCCAOOGG GACTCCACCC CCCTAAGCAA GC A\ CY2 AC AT ACTCA 0AGTT 3600 
GATGAGTTGC TGG TC TT" 2 3 A GT2CCAGCTC T3TTACCCTC CCTTTACTCC AC2AGCCCGA 3 600 

30 

CCiACCCATGA CTGAG3- A< >3G GATTTCTACA GTCTCAGGAT TTA3AAA3TC TGTAAGCCAT 3720 

, _. T - , ,r,"-Tv^^TA r-T^aa^a fAA^C^TA ATTTGTTX3AG 

CCATGCTLCA UAAAor-. A'-^'j rti'-iv^j.^.^ ^ - ■ 

35 GTTCTCAAAC TGACAGCCAG CGACACTG3G TGGGAGGCCC T G 3 AT < Z TGTT CTCCCTGACT 

Q3G3GAGGAG CAGCCACTAG GACTTTAGCA G3AAGCCCAC ATG0AG3CTC C02CAGQ2TG 

TGGCCCAGCT 'GGTGATGGCC CTTTTGCTCC TGGCAGCCTG AGCCACAGCT GC 2T3TATTG 

TCCTCATCTG TTCTGACTGA AGGATGGAGG T3CTGAATAA ATTAGG7CTC AG3CI rTCTAC 4 02 0 

CACCAGAGAG CT3GAGAATG ^TCCACGTC ATTCAAGGAC CTGAMTT^- TATGCTCAGG 4080 

45 AGC A 2T 1 3G AA TC2'L-JTTCTT CCA 3GGAG3A ATT AGCCTGC AAC X 1 ' rT AG' i A CTTGAAGAOG 

CAAGGTATTT AATAA2TG0G 0GA3GATG3G TGTGGTGCCT CACA2CTGTA AT 22 ."AGCAT 

TTTG 3G AGO 3 TGAGGT< 3C 1 3 AGATCCCAAG GTCAGAAGAT CGAGA2CATC CTGG2TAACA 

TGGTGAAACC CCATCTCTAC TAAAAATACA AAATTAAATT GGC2GGGCGT GAA 



3780 
3840 
3?00 
3960 



50 



4 140 

4260 
4313 



(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147:

5 

GGaGAGAGCCT CAAGCTGACT TO j ATTATGT GGTCCCTJAA AT C T A COG A C ACATGCAGGA 60 

GGAGTTCCGG GGCCGGTTAG AGA3GACCAA ATCTCAG^ jT CCCCTGACTG TOGCTGCTTA 12 0 

10 TCAXVA'OGGS AGTGTCTACT CAGOTGOTAT GGTCACAGCC CTCACCCTGT TOXTCTTCCC 18 0 

ACTTCTO^TG TTGCATG03G AGCG7ATCAG OCTTGTGTTC CTGCTTCT* jT TTCTC-CAGAG 240 

cttoottctc ctacatctgg ttg7to:tog gat ago goto ac:ag:octg gtccttttag 300 

15 

TGTGOCATG3 gao:;gagtct o^gtttggsc got :atgggc acacaovoot tgtactcca: 3 60 

ao-gccacgag cctgtctttg gagcgatoga ttgg-::atgca G-OTroGTG j gattcccaga 42 0 

20 GGGTCATGGC TCCTGTACTT G^CTGCCTOC TTTGCTAGTG G3AOC3AA3A C CTTTGCCTO 480 

CCACCTCCTC TTT3CAGTAG GTTGCCCACT GOT "TGCTC TGO:CTITCC TGTGTGAOAG 54 0 

T3AAO-GGCTG CGGAAGAGAC AGCA^CCCCC AGG3AATGAA GCTGATGCCA GAGTCAGAO: 600 

25 

OGAGGAGO\A GAGGAGCCAC ' V GAT G 3 AG AT <3CG<3CTC 03G GAT3C 3CCTC AGTAOTTOTA 66!) 

t\>3agg\gtg ctocagctg3 3Cgtgaagta cctctttat:: ■ z tt 3 3T ait- " agattctog: 720 

30 i:tctq: gttg gcax:t:oa tccttcgcag gcatctcats gtgtggaaa3 t3tttgcccg 780 

taactt3ata tttgagoctg tgogcttcat tgtgagca3c gtgggacttc tcctgggcat 34 0 

a3ctttggtg atca3agtgg atggtg3tgt '3a3ctcctg3 ttgag3cagc tatttctggc 90 0 

35 

C3AGCA3A3G TAGC3TAGTC TGT GATTA GT ■GGDACTTGGC TAG AG AGA 3T GCTQGAGAAC 950 

AGTGTAGC 7T GGCCTGTACA GGTACTGGAT GATCTGCAAG ACAGGCTCAG C3ATACTCTT 102 0 

40 ACTATC AT 1 3C AGCDAGGGGC CGCTGACATC TANGACTTCA TTATTCWATR ATTCAGGACC 1080 

ACAGTGGAGT ATGATCCCTA ACTCCTGATT TGGATGCATC TGAGGGACAA GG3GGKCGGT 114 0 

STCCGAAGTG GAATAAAATA GGCGGGCGTG GTGACTTGCA CCT 118 3 




(2) INFORMATION FOR SEQ ID NO: 148:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 734 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148:

G AATTC GGC A GAGTGAAGCA TTAGAATGAT TCCAACACTG CTCTTCTGCA CCATGAGACC 60 




CCCAGGGC AAGATCCCAT CCCATCACAT CAGCCTACCT CCCTCCTGGC TGCTG 3GC AK 12C 



15 TTCTGAQX.T CACCACTGCC ARC CTCAG3C AACATAGAGA G2CTO.:TGTT CTTTCTATOC 
TTviGTCTOAC TGAGCCTAAA C/TTGAGAAAA TOCGTGCCAA GOCCAGTGCC AGTG7CTTGG 



ATTTTGAGTG 1TCTGTATGT ATCCTGCTCG GAGGTTG! :A TA7AGAAA1 



18 0 



GATGTCGCCA GCATTACCTT CCACTGCC'ri TCTCCCTGGG AAC<TAGCACA GCTGAGACTG 
GGCACCAGGC C ACGTCTGTT GGGACCCACA GG AAAG AGTG TO^AGCAAG TGCT4TGGCTG 240 
AGCTTTCTAT CTTCTCTAGG CTCAGGTACT GCTCCTCCAT G2CCA/IG;GYT GGGCCGTGGG 3 00 

GAGAAGAAGC TCTCATACGC CTTCCCACTC CCTCTGGTTT ATAGGACTTC ACTCCCTAGC 
CAACAGGAGA GGAGCCCTC 2 TGGGGTTTCO CCRRGGCAGT AGGTCAAACG ACCTCATCAC 



3 60 
420 



AGTCTTCCI'T CGTCTTCAAG CGTTTCATGT TGAACACALrC TCTCTCCECT CCCTTOTGAT 430 



540 
6 00 



GGCCCCTTTG GCTCTCCCTC ACTCTCTGAG CCTCCAGCTG GTCCTC,GGAC ATGCAGCCA3 660 
GACTGTGAGT CTGGGCASCT CCAAG-IXXT 3 CLACCTTCAAG AAGTGGAATA AAT2TGJCCT 720 

734 

TT-XTTTCTAT TTAA 



(2) INFORMATION FOR SEQ ID NO: 149:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1405 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149:
GGCACAGTGG AC2CCAGACT OCCTCTOOX: CTTTCTCTGC OPGGCiGAGAC CCACTGTGTG 
40 C ATGGC ATC A CTGACTCC 2A TACCTCTGGC TATCAAAGGT TTCTMCCATG GCOACCCTGG 

aagsaaacca ga:^oagcta ga2a^aga tcaggtccct tgtattctgg ttocatcoct-ot 

Oty^AAATTGT CTCAIXXTTO^G OTIiTGTCCAG ARGC;TCCCTG cr?TCrCTCAR GGATOCCAAA 240 
TCTACAAGAA TCTCTCCTCT rCCAGTTCCT AT AAC C TC T C CTCC TTTTC TCCCTTIA^ iuQ 
CCTTGCAGTA GTAGCAGCCA C^TTCTTTCT ATCTCTGGGT 7AGT0CATTA TCTCTGGTGG 
50 CTCCCTTACC CAGGACTTTG GGAATGGTCT TTTTGTAATA CATTCTCCTC AAATAATTCA 



60 
120 
180 



360 
420 



-3CCCG 4R0 



TGTTCTAAAT AACTCCMACA AGGA^.RTCAG 
TAATATCTTT CCMYGGAAA3 AWAA.T GAT AT 

5 

ANTCTGTGTA TTOOCCCTGG GGTGC-3CCAG 
G' ITT A 3 A AAA TT< 3 AGATAA 3 33 C TT ' ATTC T 
10 A< j AATAAAAT GAGAG3ATGT GTGTCAA3GG 
TC'TGAiXACC TCCATACTGT TGACA3TCAA 
CITCCCTT.^A TGAG3TGTGC GATGTACAAG 

15 

AGGAAACACT TAG-3AAACT3 GGCTTTCTGC 
' f AATT AAAGT TTTTAA^VM V CAATTTG3AA 
20 ATAAGGAGTC AGT3CATGA3 CTAACCG3TC 

c;taagtttat CACNTTCTTT cagggactga 

GATGTCCTAT G3GGTCACAT TGATC 

25 



C AC ATTTGG A ATATCAV.TAT CTTTCCAT3A 7 30 

T3CMAACTGG GAGTGTCCCW AGCAHATCTG 34 0 

CCCCTTAGAC TCTATC-GTCT CATTCTCTTT 90 0 

CTCCCCAO:: CACCCATCCA TATT3TTTTG 960 

^"^XTTT^j CA ~^ rr ^■~, r i^ rrx '^ t-3 ?vTT IG'-'O 

GTAATATTTC ATCAG3ATTC CATTCACtGMT 108 0 

AGTYGTGA33 TG3CAAAGGA T C-GGCT C C TG 1 1 4 0 

CATTAAAACA GACAAACGTT TGTGGTGACC 120 0 

AGTI 1 AGC A.-. G CTAGCTCCTK TGC AGGVJAAA 12 60 

CCGGGCTGCT TGCCATTCCA AACAACTGCA 13 2 0 

GGTTTCCA>3 C AC AG A C TTG GATAAGGAAG 13 3 0 

1405 



(2) INFORMATION FOR SEQ ID NO: 150:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2890 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150:

TTATATGCTA CAGCTACAGT AATTTCTTCT CCAAGCACAG AGG AI JCTTTC CCAGGATCAG 60 

40 

GGGGATCGOG CGTCACTTGA TGCTOCTGAC AGTGGTCGTG GG AGC TGG AC CTTCATGCTCA 120 

AGTGGCTCCC ATGATAATAT ACAGACGATC CAGCA02AGA G AAGC TGGGA GA. C TCTTCCA 180 

45 TTCGGGCATA CTCACTTTGA TTATTCAG VZ, GATCCT3CAG GTTTATGGGC ATCAAGCA(3C 24 0 

CATATGGACC AAATTATGTT TTCTX LATITAT AGCACAAA3T A PAAC A ; 33CA AAATCAAAGT 3 GO 

AGAGAGAG2 2 TTGAACAA" "XT CCAGTCCCGA GCAAGCTG3G CGTCTTCCAC AGGTTACTGG 36 0 

50 

GGAGAAGACT C AG AAGGTG A CACAGGCArA ATAAAGCGGA G3GGTGGAAA GG ATGTTTCC 420 

ATTGAAGCCG AAAGCAGTAG CCTAA03T2T GTGACTACG3 AAGAAAC 3AA GCCTGTCCCC 4 SO 

55 ATGCCTGCCC ACATAGGTGT GGCATCAAGT ACT AC AAA 3G G3CTCATTG 3 AC G AAAGGAG 540 
GGCAGGTATC GAGAGCCCCC GCOOACCCCT CCCGGCTAOA TTGGAATTCC CATT AC TG AC ' " 6G T 0" 

TTTCCAGAAG GGCACTCCCA TCCAGCCAG3 AAACCGCC3G ACTACAACGT GGCCCTTCAG 650 




960 
1020 
108i.) 



1320 
13 80 



401 



AGAT7GCGGA TGGTCGCACG ATCCTCCGAC ACAGCTGG2C CTTCATCCGT AGAGCAGCCA 720 
CATGGGCATC C CAC C AGC AG CAGGCCTGTG AA C AAA G G T C AGTGOZATAA AYC2AACGAG 7 30 

TCTGACCCGC G7CTCGCCCC YTATGAGTGG CAA03GTTTT CCACC 3AGGA GGATGAAGAT 84 0 

GAACAAGTTT CTG7 TG TTT G A' G 2C AC AG A '2 TTTT7TGGAA GCAGA'GGGAG C C A CCTGAAA 900 
GGAGAGCACA A : GAAGAC 3TC CTGAGCATTG GAG 7 7TTGGA ACTCACATTC TGAGG A 2 GGT 
GGAOTAGTTT GCCTCCTTCC CTGC 7 TTAAA AG7A7CATGG GGSTTCTTCT GCCCTT7TTC 
CTTTCCCCTT TGCATGT3AA ATACTGTGAA GAAATTG7CC TGGCP.C1TTT CAGACTTTGT 
15 TGCTTGAAAT GGA2AGTG7A GCAATCTTCG AG C TC C '2A C T G^GCTGCCT G7CACATCAC 114 > 

ACAGTATCAT TCCAAATTCC AAGATCATCA CAACAAGATG ATTCACTCTG CCTGCACTTC 1200 
TPAATGCCTG GAAC^ATTTT TTTTAATCTT CCTTTTAGAT TTCAATCCAG TCCTAG7ACT 12 60 

20 

tgatctcatt gggataatga gaaaa<octag cgattgaact acttggggo. tttaacccac 

CAA2GAAGAC AAAGAAAAAC AATGAAATCC TTTGAGTACA GTGCTTGTCC AC ' rTG TTT AC 
25 AATGTCCTCC TTTTAAAAAA AM,V^AAT«5A GTTTAAAGAT TTTGTTCAGA GAGTAAATAT 1440 
ATATGGATTT AAT< 2ATT AC A GTATTATTTT AAA* 7CTTAAG TAGGGTTGCC AG7CTGGTTT 15C0 
CTGAAAAACC AAAT ATGC C G GACA2GGTGT GGCCACACCA AGAAGACGGG AAGACCTGGC 1560 

30 

TTCTCACCCT GGCTTCCCAT GTC2TTGTGG TCTCACCCG7 GAAGTG7CCT ATCCTGGAAG 16.0 
TA'Ilj AAATG'I " 'l'AUC'.-AA'i'l'A ai^.'.rtnunu ^ „ j. v.^ * - * ------ 

35 GTTCTTCTGT AAAACTGTTT GCACATGGCC AGGGGAGG2A ACTAGGACCC TTGTGTCCTG 174 0 

TCT 7AGCCTT ATGGAGG 2 AG GA7CGTGTCA TTGG7G:;ATG TGTCCTQ7TC CATTGAGATG 
GATGG2AAAC CCGATTTTTA AGTTATATTT CTTTCATTTT TGTTAATTTA GAGGTGTAGG 

40 

TTTTGTTTTT TGTTTTTTTG TTTTTTTTTA AG AG AAAC AT TT AT 'AACTGG ATAGCATTGC 192 0 

AGTGAAAGCA G 2 TTGGG ATG TTC X J AGCT AA TCCCAGCTGT CTATACTG2T CTTTCAACAC 19 30 

42^ AGXTCCCTT TATTGAATTG C.X7^ATTACGGA ATAAaLAa^ ^'' ^-^ f^™.^.-., 

CAAAAAGCTG GTT AG AC A T G CCAG7CTTTG CAAGXGXGGT TAGTCACCAA AGACTAACCT 212U 
7CAAGTG2CT TTATGGACGC TGGATATAGA GAAGG 2 1 2T AA GTGTAG7AAC CATCTG2TCA 2160 
CAGGTGCTAT TAACGCTATA ATGACTGAAA TGACCCCTCC ACTCTATTTT TCTGTTGTTT 222 0 

T-v-A -A VC-T G c ' GG A AAAGT GAAGG7TGCC AATCTCAGTA GTA2T7AAAT GTG AG.7AACT 



1300 
1860 



2230 



AAT.V 



60 



WO 98/54963 
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TGTC AATT AT GCATTTGTAA TTTTACATGT 
AGGCTATGGA C7TCATGT0C AT AT AG AAA 1 , j 
5 AAATGTTATC TAAGCATTAA GTAATTGTAG 
GTGATGTCAA GTGCAGAAT :- TACAATTAAC 
TGTACCTGTA TGTCTTTTAG AAAGACATTG 

10 

T AC AAT AAT T GTAOATATTC GTTATATTTT 
ATGCTTCTAC ATC CAGTTT Z TAG AAGC TO Z 
1 5 AAAAAAAAAA 






AATATGCATT ATTTGCCAGT TTT ATT AT AT 2 52 0 

AC AGAAATC T AGCTCTACCA CAAGTTGCAC 2 580 

AACATAGGAC TGCTAATCTC AGTTCGCTCT 2 64 0 

TGGTGATTTC CTC AT AC TTT TGATACTACT 27 00 

GTGGAGTCTG TATCCCTTTT GTATTTTTAA 27 6 0 

TGTTGAAGAT GGTAGAAATG TACTATGTTT 2 82 0 

AAAATAAATA AATATAACAT AAAAAAAAAA 2 88 0 

2 8 ( r0 



(2) INFORMATION FOR SEQ ID NO: 151:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2399 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151:
30 GAACTTTTCC ATCTGGCAAA CCGGAAACTC CATCC CCATT AAACCAACTC CCCCTTTTGG CO 

TTTCCCCCCC AGNGGAATAG AATTTGGACN C C C AT AT AAA TCCAGGAAAC CACCTAAATT 12 0 

CTTTAGTNGT TT3TGTTTO^ AAGATCTAAG GTCATGGTAA ACATTAAGTT CTTAAAATTT 130 

35 

TTGGGAGGGA CCAGTGCACC TCTCCCTCTG AATTGTTCNC CAATTTAAAA TTGGAGTAAG 24 0 

GTTTTAAAAT GTCTNATTCC ATTGGAA3GG TNTGTTATTT CATTTTGAGC CCAGAGGGGA 3 00 

40 GAGGCACATT TTAAATATCA GAATTAGATT AGCTTTGAGT TTGTACAATT 03GAACATAA 3^0 

T AG ATTTTC A TAAATTATGT GTGCCTTGTT GGAAGTGTCA ACTGTCTTTA TGTCTGCTTG 42 0 

TAAAAGTTTC AAAATATGTT TTCCCTCAAA AAGGCAACGT TACTTCATTT GCTTGAATAT 4 80 

45 

TATGATAGGA ATGCTTACTG ATATTACTTG ATAGTCATAT ATAGCCTAGG AAATTTAACA 540 

T AT AT AT AA.C TATAGCAGTA TTAATAATGA TAGTTGTACT TCTTTAAAAC ATTAAATTTG 600 

50 AGGAAACTTT AATGCTGT 2T CGTGTACATT GC TTT ACT AC AGTGAGGGGG AAT AT C CTTT 660 

AGATTGAGCC T CAATTT ACT GGTTAGTAGT ATGTGAACTC TGGTATAAAA ACGTAAACTA 72 0 

GACAGTAGAG CCGATGAATT AAAATTGTAA ATTGCTACAT T03CATTTTC TACCTCCTTT 7 80 

55 

TC TGTC AG AG TATTACTTTT T CCAGC ATTT ATTCTTATTT GTGAGTAAAG AGGAAATGGG 840 

AACCTGAGGT TAAAATTGAC ATTTTTGTTT CATTGAGAAT TTAAGCAGTA GGTACAGGAG 900 

60 AAGTGACTTG TC AC ATT AAT TTGGTGCCTA AATCTGTAAC T AC AAGTTG T GATCGACATG 960 



TAG AAAAT G T CTAAGAAAGG TCATATGCTG AATATTTTAC TTTTCCTGTA TAGTCTGCAT 102 0 

GATTTGTTTC ATAAACCCAG CTTATTTCCT CCAAAAAGCA AAATGG TC C T GTAATTTTTA 108 0 

5 

AA GT AAAAT A AACGTGCCAT TTTGTCTGCA ATCTATAATT TCAGGAAGTT ATTGRAAGTT 114 0 

ctgactcagg o:r: y nTAAC agttcaagca attgt c ag tt atattttgga aactccatct 1200 

10 GTGTAATTCT CCAGTGC2TT GAAAGAATTA TTAACTTGGC AACACTATTA AAA CTTT AT A 12 60 

AAAGATGGTC TTTAGTGCAC GTGTAT2ATT ATATACACGT TTTAAAGTCA TATTOCTTAG 13 2 0 

CTTGTTAATA ATGATTGTCC ATGTGTGCTG GGTTTGGGTA ATT CTTT AAA GGAAGTTTTC 1390 

TAGATTTGCA CTTGATGTTT 3TTTTTTAAA AAC TG ATT AT TTATGGCCGT GACACTGTTA 144 0 

CCAGAA AAGT AATTCTAATT AAGTTATTAT GC AAAGTC AT CTATAAGTAG CATCTGGGAA 1500 

GAGGAGATSG AGGCCA^AO'T TTGCTATTTT AGTATGAAAG GAGGATCTGT TTGG3AAACA 15^2 

TAGATTGTCT T CCCCT CAAA TGAGGGGAAA AAAAAAGACC CTTTGTTC AA AT G GATTC TG 162 0 

TTGTAAAAAA TTATTTTTAA aggaaatcac AAATTGTATG TCATTCTTAA TGCTAGTCTT 1630 

ATAGAATAAA TC CAT AAAAT TGTTTTTATG TTCAGTATGT TTATGTCATT CT AAATGC AG 174 0 

CAAATTCAAT GATAGCAGTT CAATTGACTC ATAGCAGTGT TTTGTATTTT TT 2TAATTCT 1800 

30 TTAG2TTTCA ATATTGCIACT AAAGTCTTGT TTGTGAATAT AGTTTCCGTA TGGCAAATGA 1860 

TTTCTTGCTT ATTAGCTTTT GTTAAAGAAT GCTTAGTAAG AGCTAAGCTT TTAAAAGTAA 192 0 

TGCAAACATT TATCGTTAAT AAAAC CT AT G GTGTAATATC ATATAATGCT TTTCTTTGAT 19&0 

35 

CTTTGGAGAA TTATTCTTTT ATAGTAGTAT ACATGAATTT TCATTTTTAA AGCATTTAAA 2 040 

AACAAATCTC AAT AC ATT AA AAAACCTGTT ATTGTTAAAA RGGAAATTAC CATGCCTTTA 210 0 

40 AGAAACAAGG ATGTACATCT TCAATTCAGC ATRAGTGTCC AC ATCT AG AA GGC TC TCATT 2160 

GAG TTGTTT ACAGTTAAGG TACCTCTATC TAAAGGGCCA AAGAAGCATT TCATA'iTTTA 22 2 0 

A~ACCTCACA TTCTTTCAGG ATCAAGACAT ATG AAAAT AG TCTGAATAGG ATAAATTTGG 22H0 

45 

ATAGG WGTA ACTTAA' 2C AG TCTOGGAAGA TTCAGGCTTT TPCTATKAAA AAGCTTATTC 2 34 ? 

CTCTTCACAA CTCNGG rGGT AGG 1 JTTT 1 2 AT TTTTCAAGAG GGTAGATATT TTAAAGCCA 2 3 99 

50 

(2) INFORMATION FOR SEQ ID NO: 152:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152:

CGT 3CCTGTA GTAA3CTCAT COCTGCCTTT GAGATGGTGA TGCGTOCCAA OS AC AATGT r 6 0 

5 TACCACCTGG A- 2 TG 3TT P 1 3C ATGTCASCTT TGTAAT2AGA GATTNTGTGT TGGAGACAAA 120 

TTTTTC:TAA AGAAPAACOT GAYCCTTT<3C CARACG3ACT AC 3AG3AAGG TTTAATGAAA 180 

GAA'SGTTATG CACCC-2M3GT TCSCTGATCT ATCAACATCA CCCCATTAAG AATACAAAGC 240 

10 

ACTACATTCT TTTAT7TTTT TTSCTCCACA TGT AC AT AAG AATTG AC A<2A GGAA^CTACT 3 00 

GAATAGCGTA G AT A T AGS AA G3CAG3AT3G TT AT ATSG AA TAAAAGGCGG ACTGCATCTG 3 60 

15 TATGTA3TGA AATTOSCCCA GTTCAGAGTT GAATGTTTAT T A TT AAA 1 3 AA AAAA3TAATG 42 0 

T AC ATA TGGC TGGATTTTTT TGCTTGCTAT TCGTTTTTGT GTCACTTGGC ATGA 3 ATGTT 4 30 

TATTTTGGAC TATTGTATAT AATGT ATTGT AATATTTGAA (3CACAAATGT AATASAGTTT 54 0 

20 

T ATT CTGTT A CCATTTGTGT TCCATTTGCT YCTTTGTATT GTTG 2 ATTTA GTACAATCAG 600 

TCTTTAAACT TACTCTATAT TTATGCTTTC TGTATTTACC AGCTATTTTA AATGAGCTGT 660 

25 AACTTTCTAG TAAAG AATTG AAAAGCAAAT CCT C ACT AAA G3 AT AC AC AG GATAGGATAA 72 0 

AGOCAAGTCN CATCAACATT AAAAAATACT AAAANANAAA ACACAAAAAA AAAAAANCCC 780 

GGGGGX:X3G2C CGGAACCCAT TC SO 2 

30 



(2) INFORMATION FOR SEQ ID NO: 153:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 461 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153:

CTAGGAGCAC CGAGCAfOCTT GGCTAAAAGT AAGGGTGTCG TGCTGATGGC CCTGTGCGCA 60

CTGACCCGCG CTCTGCNCTC TCTGAACCTG GCGCCCCCGA CCGTOSCCGC CCCTGCCCCG 120

AGTCTGTTCC CCGCCGCCCA GATGATGAAC AATGGCCTCC TCCAACAGCC CTCT3CCTTG 180

AT3TT3CTCC 2CTGGCGGCC AGTTCTTACT TCTGTGGCCC TTAATGOCAA CTTTGTGTCC 240

TGGAAGAGTC GTACCAAGTA CACCATTACA CCAGTGAAGA T3AGSAAGTC TGGGSGCCGA 300

GACCACACAG GTGGGAACAA '3GACAGGGGG ATTTAAGCAG T2AAAA3GAA AAA2ATGTTA 360

AGACCCTASA CTTGTATATT GACACACTTG TACCTTGTAA GSCACAGGAA TGTAATTAAA 420

AAGCACTTAT TTGGCWNAAA AAAAAAAAAA AAAAAAAAAA C 461



WO 98/54963 PCT/US98/11422

(2) INFORMATION FOR SEQ ID NO: 154:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2388 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154:

GCCCACQ2GT CCGAAAOCGG AGAACGCTGG TGGGCCTGTT GT<2GAGTAC3 CTTTGGACTG 60 

15 AGAAGCATCG A3GCTATAGG ACGCAGCTGT TGCCATGACG GCCCAGGGGG GCTGGTGGCT 120 

AA C C GA G 3 2 O Q3CGOTTCAA GTO3G02ATT GAGCTAAGCG GGCCTGGAGG AGGCAGCAGG 130 

GGTCGAA3TG AC0O3CGCAG TGGCCA3GGA GACTCGCTCT ACCCAGTCG3 TTACTTGGAC 240 

20 

AA3CAA3T32 C TGATACCA 2 CGTGCAAGAG ACAGACCGGA TC 2TGGTGGA GAAGCGCTG2 3 00 

TG32ACATCG CCTTGGGTCC CCTCAAACAG ATTCCOATGA AT 2TCTTCAT CATGTACAT 1 2 3 60 

25 GCAG3CAATA CTATCT 2CAT CTTCCCTACT ATGAT3GTGT GTATGATGGC CTGGCGACCC 42 0 

ATTCAGGCAC TTATG02CAT TTCAGCCACT TTCAAGATGT TAGAAAGTTC AAGCCAGAAG 4 80 

TTTCTTCA 2G GTTT I 2 J 2TCTA TCTCATTG3G AACCTGATGG GTTTGGCATT GGCTGTTTAC 540 

30 

AAGTGCCAGT 2CATGGGACT GTTACCTACA CATGCATCGG ATTGGTTAGC CTTCATTGAG 600 

CCCCCTGAGA GAATOGAGTT CAGTGGTGGA GGACTGCTTf TGTGAACATG AGAAAGCAGC 660 

35 GOCTGGTCCC FATCTATTTG GGTCTTATTT ACATCCTTCT TTAAGCCCAG TGGCTCCTCA 72 0 

G7ATACTCTT AAACTAATCA CTTATGTTAA AAAGAAC2AA AAGACTCTTT TCTCCATGGT 73 0 

GCG2TGACAG GTCCTAGAAG GACAATGTGC ATATTACGAC AAACACAAAG AAACTATACC 84 0 

40 

ATAAC'OCAAG GCTGAAAATA ATGTAGAAAA CTTTATTTTT GTTTCCAGTA CAGAGCAAAA 900 

C7J\CAACAAA AAAACATAAC TATGTAAACA AG AGAATAA. 2 TCCTGCTAAA TCAAGAACTG 960 

45 TTGCA3CATC TCCTTTCAAT AAA IT AAATG GTT GAG AAC A ATGCATAAAA AAAGTTGCAC 102 0 

AAGTTCCTTA TTTTCCTTAA TATTTCACTT CTATTTAATA CAAOCTGGGA CATAAAAATT 108 0 

CTGTTGGGGA TACCTGGGGG AAGATGTGAG AAACTAATGC TGAATTCAGC TTATACATGA 114 0 

50 

TG AAAA' 3 AAA AACCA 3ACAA AAGGAGCACA TAAATATGCA TACAGTGTAA CTGTTATTAT 1200 



60 
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GTCATAGGTA AGACTCAAAA GCGGGATCTT ATTCAAAAGG CAGGTATTTC C7TTGTTTTC 1500 

TGTCTTGAAA TAGCCCCTTC CGCTAAGGTG CATTCTCTC A AGTTTTCAGT ATTGCTTT AT 1560 

TTGCAGTGAT T AAAAG AG AT GAGAGACTTT GGAGACAGAC AACGTAAGCA AG ACAT AG AC 1620 

AGATGAAATA CTCTAGACAG AGATGAATAT AAATCTGGGC TAATAACCAG TTTTCCATGT 16 80 

AACAGTGATT TTGTGTTTCG GGCTGAAGCA GT GGTT AT AT TAAAAGCCAC TAATTCCCTT 17 40 

ATCCCTTTAA AAGATTTTTA CAATTCTCCA ACCACAAACA GC ACTT CT AA AACTAACTTT 18 00 

ACTTTCTGCC CATAATTTGT TCTACATGGA AAAAAAAAAT ATTACTTTGG CCAGGGGTGT 136 0 

15 GTGTAAATGT GGCAGAATTC CTAGGCAGGC TGACCTTTAC AGTATGGGCC TTTAAGATAC 192 0 

TGGATCCTGG TTGOGCAACA AGTGTC ACGO CTGAAGTTTC TGAAAACAAA TTAGAAGACT 19 BO 

GTTGGCTTOG CTAATCTCGT AGTTCAGGGO CAAGTTTCTG TAGTCAGAAT GAAGAATAAA 204 0 

20 

ATTG AAAG PJ\ AAAGG.GGGAA ATGCTTATAG TTGGCATTAA GTTGAATGCC TCAAGTCTTA 2100 

ACTATGGCTT TGTA3ATGAG GCAAAA3ATT TCTTAGTGGT AAAATTTCTT CAAC AGGT C A 2161 

25 ATGCCAATCT GTATGCCATT TTAGTAAAGT AGGTAAGGAG AGTAGCCGCT CAGTAACTTT 2 22 0 

GGC ACT AAAG AAACAGTGTG GCTCTAGAAC TTCCAATCCC ATTGCTAGAT GTGCCCTTTA 2 2 80 

AAAGATGGTG CAGI GCTTTG A3GGAAGGAT GTTTAGCCAG TTTTC CTAGT ATTTGTTCCT 2 3 40 

TAAGATTTTT TGACCTGTGC TTAATAAGAC GGACGCGTGG GTCGACCC 2 3 38 






(2) INFORMATION FOR SEQ ID NO: 155:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 642 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155:

AAAACAGACC ATTTAAAAAC TCAGACAAGA TTATATTTAA TATATTAATT ACTAAAAAGG 60

CACAAGATTA CACTGAACAT ATTAGCTACT AAAAAGGCAC TGCTAAGACA TTCAAGCAAA 120

TAGCTATTAC ACACTACTGC AC^ATTTTACA GGTTTCTAAT TCTAACATAT GTTTGAAAAA 180

TCCGTGAGTA TTC ■ 2AAAAT A TATTTAATAA TGGAATATCT GCATTAATAT ACCATCCATG 240

TGTTTTTACC ATTTGCCTTA ATATTGAATA TACTGTTTAC CTCACACTAA AAAGAAAACC 300

AGAAGCCTTA TTTGTGATTT TGGGAGTGGA AGCTTCCATT TTTGTGTCAA AAATGAATCC 360

TGATTCTTAT GGAAATCTCT GTTATTAAGA TATTTCAAGA TGAGACAACA CTGAAGATCA 420

AATTGTGTTT AGTATCACTA TCTTCTCTCC TC3TTTCTCT CTTACTCCTC ATCCTCCCAG 480

AATCTACCAG TTTATOGTAG AAAGATaGOA ACCTTATTTG AATGTGTTTT TTTTTTTCCA 540

TGATGTCCAA TTTTGTTGTG GGAAAGGATT TGGATAAAAT TTTTGTTTAA ATTTTGGTAG 600

ATTTTTATCT ATACAAATTT AAATAAAATT ATGTTTTGTA AG 642



(2) INFORMATION FOR SEQ ID NO: 156:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1251 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156:

GCCGCTGCCC CTCCACGGAG TTOCTGATCA TGTGGGCTOT CAT G C ACAAA CCCGGTTCTT GO 

TGTCCCTCCT AATATCAAG.C AGTOGATT'OO CTTGCTGCAG AGO 3GAAACT GCACGTTTAA 120 

25 AGAGAAP^TA TCACGGGGCG CTTTCCACAA TG^GAGTTGCT GTA GTCATCT ACAATAATAA 130 

ATCCAAAGAG G A GG C A G IT A CCAT3ACTCA TCCAGGCACT 'GAG GAT ATT A TTGCTGTCAT 240 

GAT AAC AG AA TTGAOGG3TA A-GGATATTrr GAGTTATCT3 1 GA 3 AAAAACA TCTCTGTACA 3 00 

30 

AAT 3 AC AAT A GCTGTTGGAA CTCGAATGCC ACCGAAGAAC TTCAGCCGTG G3TCTCTAGT 360 

CTTCGTGTCA ATAT^OTTTA TTCTTTTGAT GATTATTTCT TCACCATGOO TCATATTCTA 42 0 

35 CTTCATTCAG AAGATCAGGT ACACAAATGC ACGCGACAGG AAC GACGCGTC GTCTCGGAGA 48 0 

TGCAGCCAAG AAAGGCATCA GTAAATT3AC AA3CAGGACA GT AAAG AAG 3 GTG AC AAGGA 540 

AAC TGACCC A GACTTTGATC ATTGTGCAGT CTGCATAGAG AGC T AT AAGG AGAATGATGT 600 


CGTCCGAATT CTCCOCTGCA AGCATGTTTT CCACAAATCO TGCGTGGATC CCTGGCTTAG 660 

TGAACATTGT AOCT JTCCTA TGTOCAAAOT TAATATATTG AAOGCCCTG3 (GAAGOTGTGCC ~ ? 2 0 

45 G AA T' PTCrC OA TGTA. ."OGATA ACG TAGCATT C GATATGGAA AvGCCTCACCA G^AACCOAvAGO V30 

TGTT AACC G A AG AT G AGC CC TCG3CGACCT C 3CCGGCGA G AACTCCCTTG GCCTTGAGCC 84 0 

ACTTCGAACT TOGO GGATCT CAC'CTCTTCC TCAGGATGGG GAO'GTCACTC CGAGAACAGG 000 

AGAAATCAAC ATTC-GAGTAA CAAAAGAATG GTTTATTATT GCCAGTTTTG ■GGCTCCTCAG 96 0 



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\TAAAAAAAA AAAAACCCCG GGGGGGGCCC GGTCCCCAAT T3GCCCTATG G 12 51 




(2) INFORMATION FOR SEQ ID NO: 157:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2127 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157:


CCGGCGGGAG A<X;GAAGOTG CAGCGAGAGG CGCGGATCTC AGCGCGGGAG CAGTGCTTCT 6 0 

■JCGGCAGGCC OCTCiAGO^AG GGA<:x;TGTCA GCCAGGGAAA ACCGAGAACA CCATCACCAT 12 c 

20 GACAACCAGT CACCAG2CTC AC SG AC AG AT A CAAAGCTGTC TGGCTTATCT TCTTCATGCT lHO 

GGGTCTG3GA ACGCTG2TCC CGT X^TTT TTT'CATGACG GCCA2TCAGT ATTTCACAAA 24 0 

CCGCCTG3AC ATGTCCCAGA ATGTGTCCTT GGTCACTGCT GAACTGAGCA AGGACGCCCA 300 


GCCGTCAGCG CMCCCTGCAG CACCCTTGCC TGA'jC'GGAAC TCTC TC AGTX j CCATCTTCAA 3 'SO 

CAATGTCATG AirCCTATGTG COAT3CT3CC CCTGCTGTTA TTCACCTACC TCAACTCCTT 4 20 

30 CCTGCATCAG AO 3 AT C C C C C AGTCCGTACG GATCCT3GGC AGOCTGGTGG CCATCCTGCT 430 

-^GTGTTTCTG ATCACTGCCA TCCT3GTGAA GGTGCA3CTG GATGCTCT02 CCTTCTTTGT 540 

2ATCACCATG ATZAAGATCG TGCTCATTAA TTCATTTG3T GC 2ATCCT02 AGGGCAGCCT 6D0 


< jTTTGGTCTG GOTGGCCTTC TGCCTGCCAG CTRACAC3GC CCCCATCATG AGTGGCCAGG 66 0 

■ 3C C T AGCAGG CTTCTTTGCC TCCGTO 5CCA TGATCTGC3C TATT3CCAGT GGCTCGGAGC 720 

40 TATCAGAAAG T3CCTTCGGC TACTTTATCA CAGCCTGTGC TGT KATCATT TT 3 AC C ATCA 7 80 

TCTGTTACCT C-3GCCTGCCC CGCCTGGAAT TCTACCGCTA CTACCAGCAG CTCAAGCTTG 940 

AAGGACCCGG O'iAGCAGGAG ACCAAGTTGG ACCTCATTAG CAAAGGAGAG GA3CCAAGAG 900 


CAGGCAAAGA 3GAATCTGGA GTTTCAGTCT CCAACTCTCA GCCCAOIAAT GAAA- >2 C ACT 960 

CTATCAAAGC CATCCTGAAA AAT AT CTC AG TCCT3GCTTT CTCTGTCTGC TTCATCTTCA 1020 

50 CTAT2ACCAT TGOIATCTTT CCAGCCGTGA CTGTTGAGGT CAAGTCCAGC ATCGCAGGCA 1080 

GCAGZACCTG GG AACG TT AC TTCATTCCTG TGTCCTGTTT CTT3ACTTTC AATATCTTTG 114 0 

ACTGGTTGGG CCGGAGCCTC ACAGCTGTAT TCATGTGGCC TGGGAAGGAC AGCCGCTGGC 1200 


TGCCAAGCTG GNTGCTGGCC CG-3CTGGTGT TTGT-3CCACT GCT'JCTGCTG TGCAACATTA 1260 

AGCCCCGCCG CTACCT<3ACT GT 3GTCTTCG AGCACGATGC CTG jTTCATC TTCTTCATGG 1320 

60 CTGCCTTTGC CTTCTCCAAC GGCTACCTCG CCAGCCTCTG CATGTGCTTC GGGCCCAAGA 13 8 0 
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20 t:tccc:?<x: gcctcctcct ctgtgttctc tccatgtccc cctcccaact ccccatgccc 
agttcttagc catcatgcac cctgtacagt tgccacgtta ctgccttttt taaaaatata 

TTTG AC AG AA ACCAOSTCCC TTCAGAGGCT CTCTGATTTA AATAAACCTT TCTTGTTTTT 


TTCTCCATGG AAAAAAAAAA AAAAAAA 






(2) INFORMATION FOR SEQ ID NO: 158:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1625 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158:

;aaaa:atct ataatcagga cattgtttat gtaagttgga caanaaaaat tcttcccctt 

TATGT ITCACC CTTCCTATGA TTGCAAGACA AAATTTC CCT CCTTTACCTC ATCCCTATAA 
45 CATGGGAGGG TGACJ^\AAAT GAGGGGAGAT G j AAC C AG AT ACAAGGACAT GCAATAAGAG 
AAGCTTATTT AAATATTGTG AAATAAAGGA AG AMCC AAAG CATTTTTTTA AGTGGGGAAT 
CCTTTTGAAC AGTTATTATT TATCCATATT ATTAAYAACA TCTTTTCTtSA CAAAATCCAT 
CAGATGAAGT GTAAATGGAT AA I CTTTT AA TO GAT CT AAA C2TAGAAAGT TTCACTTACT 

, _ ^^.r^^^^-nr--- - V\— TTCTT^T 






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AA3TGAAGGC AGCTGA3GCA GAGACCGCAG AGC CATCATG GCCTTCTTCC TGTGTCTGOG 1440 
TCTG3CACTG GOGGCTGTTT rCTCCTTCCT GTTCCGGGCA ATTGTGTGAC AAAGGATG3A 
^ CAGAAGGACT OCCTGCCTCC CTCCCTGTCT GCCTCCTGCC CCTTCCTTCT GCCAGGGGTG 

ATCCTGAGTG GTCTGCCGGT TTTTTCTTCT AACTGACTTC TGCTTTCCAC GGCGTGTGCT 162 0 

10 GQGCCCGGAT CTCCACGCCC TGGGGAGGGA GCCTCTGGAC GGACAGTGGG GACATTGTGG 1680 
GTTTGGCCCT CAGAGTCGAG GGACGGGGTG TAGCCTCGGC ATTTGCTTGA GTTTCTCCAC 
TCTTGGOTCT GACTGATCCC TGCTTGTGCA GGCCAGTGGA GGCTCTTGGG CTTQ3AGAAC 
AOGTGTGTCT CTGTGTATGT GTCTGTGTGT CTGCGTCCGT GTCTGTCAGA CTGTCTGCCT 
GTCCTGGGGT GGCTAGGAGC TGGGTCTGAC CGTTGTAT'GG TTTGAC CTGA TAT AC TC C AT 1920 



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TTGAAGTCTO AGTCACCAGA CACAG3TTCT AT ACAA TT AA TGATGAGCTG 'GAGAAGTAAT 66 n 

ATGTA<3CTAA TTTTTCAAAA GCATTGAATA TACTTTCCGG AAAGAAAAOA ' AAAATT AAAT 7:!0 

5 ATTGCCACAT CTT02CAGAA TCCCATCTGA CACCTTAACT TTGTCAG3TT TCCTAZTAACT 780 

TGCTAATCAA 3TTTTATACA r r rCT AAATC T CCC IAGTTTC TTT03G<G::TG 'AAAGATG2AA 84 u 

CTTC<2ATTTA ATAGAAACTT T3AAATCTTG GG3TAAGG3A G2AGTG303G iGAC TAG3 3AG 900 


AACrGATAAGA AATAGAATTA TT-iLAAAAGCC CCC AC C AG G Z ACCTTCCTG3 CCAGAATATG 96H 

CAGAGTAATT 3CTGCTGG:T TGACCTTTGA AAGTCCCTCG AAACTATOGA 'GATGAAACTG 102'"' 

15 AGTCTGTTTT TGATATTGTC AGATGTATTC TAC 2TTGGAA GTC02NACAC f 2T AAACT 3GA 103' » 

ATTCTTGTAT TTA CATCTCC TCCACTGTC Z CCCACACCAC CCCTCAATTC CTGCTGC 2CC 114 ■"> 

TG3TAATGTT AAG 3 ATTTTT CTCTTGTTAT 2ATCAQ3TTC ACATT.AA.AAM CAGRTATTTA 120' > 


CAAA3* T3ACT T-AAAG3A3AG ATACTTTTAC ■ 3AA' rGT<3AT A AAATATTTT Z TTAAGAAAAG 126" 

GAAAGAG3AT GTGG3TGAAA TAAAACACCG CAT'SGATGTT GATT^GPAAA TACTG3TGTA 132 0 

25 AGAAAAG3GA G3T'2AG3AAT TTTTATT AC T 3TATTTGTAA ATGAGTTTGA A<3GAATTTGT 138!.' 

AAATGCCACT GGTACATTTT TAAG3TGACA -2ATTTGCTCC TTATAAAGTT ATTAAAAATT 144i' 

ACAGGGTAAG CTTAAATGAC GTTT*3C-3AGT AGTTTTACTT TATATAATGA AT ATT (3 AT AT 150=> 


TGTTG3TGAA CTATGTAACT TTATGATG3A TTTTTCAGTC CCTTTTCA3A ■ G 3 AAAT Z 1 Z TT 156''1 

TT 1 3C AAT* G3 T A 3T AATGTTT AGTTTAAATT < 3ACTT AAT AA ATTMTTACCT 1 3AG3 AAAAAA 162 0 

35 AAAAA 162S 

(2) INFORMATION FOR SEQ ID NO: 159:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1687 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159:



50 CGGGGTCACC AGTTATTAGA G3AAGTAACA CAAGGGGATA TOAGTGCAGC AGACACATTT 60 

CTGTCCGATC TGCCAAGGGA T'GATATCTAT GTGTCAGATG TTGAGGAC3A OGGTGATGAC 12 0 

ACATCTCTGG ATAGTGACCT GGATCCAGAG <3AGCTG3CAG GAGTCAGGG3 AC ATCA3GGT 130 


CT AAG3GAC C AAAAG2GTAT GCGACTTACT 1 GAAGTGCAA- Z ATGATAAAGA G3AGGA3GAG 240 

GAGGAGAATC CACTGCTGGT ACCA3TG3AG ( 3AAAAGGCAG TACTGCA'GGA AGAACAAGCC 30 0 

60 AACCTGTGGT TCTCAAAGGG CAGCTTTCtCT GGGNATCGAG GACGATGCCG ATGAAG3CCC 36 0 
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CANNTTT 

(2) INFORMATION FOR SEQ ID NO: 160:
(i) SEQUENCE CHARACTERISTICS:



TGG A 1 1ATC AG TCAG3CCGAG CTGTTATTT3 AGAACCCGYG GAAoGGAGoG cAGCAGGAGC 420 

AGAA3CAGCA GCTGGCAOAG ACACCCCCTT CCT 3TTTOAA GACTGAGATA ATGTCTCCCC 43 0 


TOTAGCAAGA T3AA<3CC::CT AAG3NAACA3 AG32TTCTTC GG3GACAGAA GCTGG'OACTG 540 
CGCTTGAA'OG G3AAGAAAA3 G ATGGC ATC T CA 3 ACAGTGA T A 3CAGTAC T AGCAKTGAGG 500 
10 AAGAAGAGAG CT03GAA3CC TCC3TGGTAA GAAGCGAA5C 3P3OG0CTAA AG TCAGATGA 550 
I GACG3GTTT 3AGATAGTGC CT A TTG AGO A CCCAGCGAAA CATCGGATAC TGGACCCCGA 720 
AGGCOTTGCT 0TAGGT&3TG TTATTGCCTC TTCCAAAAAG 3CCAAGAGAG ACCTCATAGA 7 30 


r I AACTCCTTC AACCGGTAGA CATTTAATGA GGATGAGGGG GAGGTTCCGG AGTGGTTTGT 340 
GCAAGAGGAA AACrOAGCACC GGATACGACA 3TTGCCTGTT GGTAAGAAGG AGGTGGAGCA 900 
20 TTACCGGAAA CGCTGGCG-3G AAATCAATGC ACGTCCCATC AAGAAGG'i\rO GTGAGGCTAA 350 

GGCTAGAAAG AAAA3GAG0A TGCTGAAGAG GCTGGAGCAG ACCAGGAAGA AGGCAGAAGC 102 0 

CGTGGTGAAC ACAGTGGACA TGTNC AGAAC GAGAGAAAGT GGCACAGCTG CGAAGTCTCT 1080 


ACAAGAAGGC TGGGCTTGGC AAGGAGAAAC GCCATGTCAC CTACGTTGTA GG CAAAAAAG 1140 

GTGTGGGCCG CAAAGTGCGC CGGCCAGCTG GAGTCAGAGG TCATTTCAAG GTGGTGGACT 1200 

30 CAAGGATGAA GAAGGACCAA AGAGCACAGC AACGTAAGGA ACAAAAGAAA AAACACAAAC 1250 

GGAAGTAACtO AGAGCTG3CA GGCTCCCAGG AG A GC ATGGG GACTAGGA<3G AAGGGTGTGG 132 0 

"ATGGCTCAG TCT^OCCOOC TTGATTACCG GCCTAGCCCC T07TCACATC ACAGCTGTCT 1380 


GAAGAAGAGT GAGGTGGAGT GCCTAGAAGT CCCGTGGTGG TCCTGAGCAG AGAGGAGGAT 144 0 

GTCCTCCTG2 CTGCCTGAAG GTCTC C CATG AAAACACTGC TGAACTGTGT TGACACTCAT 1500 

40 GACCCTTTTT TTAAACCGTT AAAGGGAA 3T TG3GTGTT03 AGCGATACTC AATGTAGTCA 156 0 

GTCTACACCT GGACGTGTC-G GC 3ACTTAAG CCCTCCCCAC CCCCATCCTA TTCCTRAATA 162 0 

AAACCAGGAT AATG3AARAA AAAAAAAAAA AAAAAAAAAG 03GO~X3CCCN TAAAGOGNGC 15"0 
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GGATGACA3A TTGC3ACANA GATTTGTGAC CCTTCCTGCT GAACTTC A 3A GGGAGCTGAA 6 0 

ANCA3CGTAT GATCAAAGAC AAAG3CAGGG CGAGA=vCA'3C ACTCACCA3C A3TCA3CCAG 12 0 

CGCATCTGTG CCCCGAGAAT CCTTTACTTC ATCTAAAG3C AGCAGTGAAA GAAAAGAAAA 18 0 

GAAACAAGAA CAAAAAAACC ATTGGTTCAC CAAAAAGGAT TCAGAGTCCT TTGAATAACA 24 0 

AGCTGCTTAA CAGT3CT3CA AAAACTCTGC CAGGGGCCTG TGGCAGTCCC CAGAAGTTAA 3 00 

TTGATG3GTT TCTAAAAGAT '3AAG3ACCTC CTGCAGAGAA ACCCCTGGAA GAACTCTCTG 36 0 

CTTCTAGTTC AGGTGTGCCA '3GC3TTTCTA GTTTGCAGTC TGACCCAGCT G3CTGTGTGA 42 0 

GACCTC'3A3C ACCCAATCTA -3CTQ3AGCTG TTGAATTCAA TGATGTGAAG ACCTT-3CTCA 4 10 

GAGAAT3GAT AACTACAATT T'2A<3ATCCAA T<3GAAGAAGA CATTCTCCAA GTTGTGAAAT 54 0 

ACTGTACTGA TCTAATAGAA 3AAAAA3ATT TGGAAAAACT GGATCTAGTT ATAAAVTACA 600 

TGAAAAO30T GATGCA3GAA TCGGT 3GAAT 03GTT"rGGA^ TATG3CATTT GACTTTATTC 660 

TTGACAATGT CCAGGT03TT ■' PT A' 3A\ 1 2 AAA CTTATGGAAG CACATTAAAA GTT AC AT AAA ^20 

25 T ATT A* 2 CA 3 A GA-3CCT 3ATG 2TCTGTGATA GCTGTGCCAT A-\GT3CTT3T GAGGTATTTG ^30 

CAAAGT'OCAT GATAGTAATG CT03GAGTTT TTATAATTTT AAATTTCTTT TAAA3CAAGT 84 0 

^ GTTTT* 3TAC A TTTCTTTTCA AAAAGTGC C A AATTTGTCAG TATT3CATGT AAATAATTGT 900 

GTTAATTATT TTACTGTAG2 ATA' 3ATTCT A TTTACAAAAT GTTTGTTTAT AAAGTTTTAT 9 60 

GGATTTTTAC AGTGAAGTGT TTAOAGTTGT TTAATAAAGA ACTGTATGTA TATTT-3GTAC 102 0 

35 RGGCTCCTTT TKGTGAAYCC TTAAAAACTC AV3TCTA3GA RG3AACTA 2T GTTTATTATA 1080 

CTAAARG3CT GAAAAMCCTC CAG3CCAGAC TGCTAAGOTC TGAAATYCCT GAGAG3TCTC 114 0 

AGACCG3GAT TCTACTTGTT CCAAG AAA' 3G GTAAAGCTTC TAAACCATCT T ATT< "TT 3 TC 1200 


TGCAAGCATG AACACAGGAG CATGTYAAGA AAATCTTTAG TACTTTCTYC CATGC.3GAGA 1260 

AATCTAGATA TTTTGAATTA GAAACACCGT CACACCCACT TGAAGATTTT TTTCCTGGGA 132 0 

45 ACATTATGTC CCGTAGATCA GAG* jT<3GTGT TGTCTTTTTG CTTCTACTGG CCATTGAGAA 13B0 

ACTTT 1 3 AT 3 A TAAAAAAGAA CGGTATAGAT TTTTCAAACG TATATAAAAT ATTTTTATGT 144 0 

^ T AT ATG TT AT GCCATAACTT TAAAATAAAA ATAGTTTAAA ATTCTATGCT AG TGG ATATT 1500 

TGGAACTTTT TCCTCAAACA AACACCCCAC ACTGACTTCA GOAAAACCCT AAAACTA3CT 1560 

ACAGATTACT ACTACGAATG AAT CAT'/ AAG TTTTGTGTCT GCAACAATTT AGAA3CACTA 152 0 

55 AGO C CAAAT A T-3AGGAAATG TGTGTATGAT GGAATTTTCT AGGACAAAAC AGAT2AAGAT 1680 
TAAAACAGGA TCAAGG ATT A ATGGTATAAA AATGGTCTAC TAAAACAGGA TCAA3GATTA " 1740" 

AAACAGGATC AAGGATTAAT GGTATAAAAA TCTCTACTGG TTACOGGGTG GCNG3GCCAT 1800 






WO 98/54963 



PCT/US98/11422






ACAGGGTAGT GGTOTATGGA TAGTTTAGTT TGGNAAGGGT AA 



1842
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(2) INFORMATION FOR SEQ ID NO: 161: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 770 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161:

GGCACGAGCC CTATGCTGTT CTTGTGATAA TGAGTGAGTC TCACAAGATC TGGTGGTGTT 60

ATAGGCATCT GGCATTTCCC CTGCTGACGC TCATTCTCTA TCCTGCCACC CTGOOAAGAA 120

GTGTCTTCTG TCATGATTGT AAGTTTCCTG AGGCCTCCCC AGCTATGTAG AACTGTGAGC 180

CAATTAAACC TCTTTTCTCT ATAAATTATC CAGTCTTATA TATTTCTTCA TAGCAGTGTG 240

AGAACAGATA ATACCGTAAA TTGGTATCAC AGAGAGTGGG GTGTTGCTAT AAACACATCT 300

GAAAATGTTA AAGCAAATTT GGAACTGGGT AACAGGCAAA GGCTGGAACA GTTKGAAGAA 360

CAGTTAAGAA GAAGACAGGA AAATATGAGA AATCTTGAAA CTTCCTAGAG TCTTAAAGGT 420

CTCAGAAGAC ATGAAGATGT GGGAAGCTTT GGAACTTCCT AGAGAC r rrGT TTGAATGGCT 480

TTGACCAAAA TGCTGATAGT GATATGGACA ATGAAGTCCA GGCTGAGCTT ATCCAGACAG 540

ACATAAGAAG CTCGCTGGGA ACTTGAGTAA AGATCACTCT TGCTAGGCAA AGAGACTGGT 600

C^CTTTTTT CCTCTGCCCT AIAGATCTGT GGAAATCTGA ACCTGAGAGA (SATGATTTAG 660

GGTATCTGGC AGAAGAAATA TCTAAGCGGC AAAACCTTCM AGAGGAAGCA GAO2ATAAA0 720

GTTTGAAAAA TTTGCAGCCT GACNATGGGA GACCAAAGTT AAACCCAATT 770



(2) INFORMATION FOR SEQ ID NO: 162:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 519 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162:



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414 

ATG3AG3TGT TTGT3AAG3T TAAATG33AA <3ACATAAAGC ACTTAGCCCA 3A3CCAAGGA 24 0 

AATGCTGAAT AC-GATAATG3 TG3CCTCCTT TGGCG2TGTG 2 TG3T3C AGG TG PG3CGAGG 3 00 

5 AAYTG33CAG 3G3TGACAGA TACCTCTTCT AACCTAGTTC CTTTCCAAGA ACOTAATTG3 3 60 

TGTCTCTCCC T 33 3 2C AG 33 AAT2" 33.AAG 3 AGGAG32TG3 333 2 3AGCCO 2AOAATACG3 420 

GAG3TTTCTC AC O 3T' 3GTAG G3AAATTGCT G3GTTGG333 TGTG33CAAC CA3AGTGATC 4 80 


3TCTCT3T33 AG 5 A 2G3ATG AGG3TTTGCT GACAGAG3C Sl'O 




(2) INFORMATION FOR SEQ ID NO: 163:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 753 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163: 


GGCACGAG2G 1 3 OA 3 3 A 3C AG 03AGTTG3TG ACTGGCACAT GGCCTCCAGC GTCCCG3CTG 60 

GTG3G3ACAC TA3A3GCG3A G33ATCTTCT TAATTG3TAA ATTG3ATCTT 1 3AA 1 33TTC AC 12 0 

30 TGTTT AAATC TTTTCAGTG3 CTTCCCTTTG TACTTAGAAA AAAAT<5CAAC TTCTTCTGCT 18 0 

GGGACTCATC CG3TCACAG3 CTTCCCCTCC ACCCTCTCTC TGCCTCATGC TCTG2CCCTG 240 

CCTGC 2AT32 3T 33GATACT CACCTTTTGT ACCCCAG3AO CCGTGCCCTC TGC2CCTCGA 300 


TCTTTG3CTG G2TG3TTGCT C3TCACTCAG TGTTCAG3AC AAATGCTCCT GG20GTACCC 3 60 

CATCTA3CCA GTCTAG3CCG GTCTTCCCTG TCTTCCCTGT TTCATTCATG GCTCTTATTG 420 

40 TTTGTTWA3T TGT3TG3TGT TGACTTTTAA CTCTCTCAGT CCCCACTGGA ATG2AAGCGA 4 80 

TCTCCCAAG3 TC3TAGAATT GTTCCTGCCT CTT3ACAG32 CCTTACGCTG TGT3TCCTCG 540 

TGCCGAATTC G3CACGAGGG TATGTG3ACT TGCTGGTATG TATGTAGGTG TTTGCTAACA 600 


CATACGTGCA CACGCAGAAT GCTTCCAGGG GACTGCACAG CCT'2TAGTTC GCAGCCCCCA 660 

CCCCTCCCTT T33CCCTGCA CTCTCCCCTC TCTGAGCTG3 ATTQ3CATGA AAG3GTGCAN 720 

50 GGTTCCTGAN CCCG2NAGCG NCACCTCCT3 GGA 753 



(2) INFORMATION FOR SEQ ID NO: 164:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1400 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double



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10 






(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164:
GGCACAGTTT ATTAATACCT A.TTATCCCIAA AGTCACTTTG GTTGGCATTG AAAATT AC AT 
C ATCTTT AAA GC AGT ATTTG TC CC C A.GA7G GACTCATCAC TAGCAAAGAC TAGGTTCATT 12 M 

GGAAGGCATA G^TGAGAGA ATGGGAAGAT G RA 1 3TGG AGG CGGGTTGTTA AAGTGCTGTC 



AGAAGTGAAA GGTATTTGCA AAGTAAGCTA CAAATGACCC ATAAATCTGT TAACAACAGT 
15 CCTTAATATG CAAAGATGAA AAACAAGCAT TACTGCTACC CAAAGGGAAC T03TGCTTGG 
TGATGTGCAG ATGGGGCTGT T3GTTAAGAG AG - T ATTAC A GGTITTCTCT CTTAGGTTTC ■ 
ATAGGAGGTA GTTA'OTGAGA TGAGATTGTT TTATCTTTTT GAATACAGAT CTCTTGTCTT 
GAGTTAGTTC 'PGAOGATGOG AGTAAT AAA 3 G AGTTTTTTG TTTTTTT3TT TGTTTGTTTG 
TTTTGGCTCC TTAGTAATAC TCCTCTGACA TTTATTTCTA TTATTCTTCA AAGAAAGGAA 

25 accaactgaa atgtttgctt taacaaacat tttaataagt TCTCTGGGTT TTTTTTTCCC 

CTTTTAAAAA AATTAGCATA TAG CAT AGO A ATAAAAGAAC T AATG TT AAC TATTGTATGC 



TCATAGTCAC ATT ^LAATTTG TGAAGGCCAA AGAAATTGAA GGGAGTCATA TTTTCATTTT 



TGGTTGAATG TGTTTTTGCT T AAAT AAAGC TTTTGGTATT TGTTTAAATO ACAAAAAAAA 
AAAAAAAAAA AAAAACTCGA 






AGTGAGTGAT TTTGTCTACT TGAA.TAA.TGG TCCATGTTTG GGGGC AT ATT GT 3 TTTCAT A 24 J 






TACAACTTAA GT3ATTTTTC T AAAJ3 AAGC A CAATGTCATT GRAAGTATTA TTGAAAAGGA 7 80 






_ TT ,-,. rT ^^ TV ^^ mT .- n r^.Ti rr a c^C-jT AG AG TGTTTTACCC 9:0 






35 ATGG AAAC AG GTTTC AGATT ACTTTGTTTT TACTGTTAGA GTCTCAAGTT TAGAAATGCT 
AACACTTAAA TCAGTTTTTT TCTCAC T AT A CTTGAAGATT GTT AAT ATTT T 3 AT AT CTT C 
C T AGCTTG AT GGAATTTAAA CATATCTTCA GATCTGT3AC AGTGACAGCC AAT AGG AC TG 
ATAATATTAG CTTCAAACCA ATAATATC 3 A GGGTTAAAAT AAAAAT2ATA GTGAAAGTAC 1140 
GATTGTAAAA TT ATGC TATA TTAACTTTTA A3TCT3TAA7 AA.CTTGACAT CAAAATGTTA 1200 
45 TGTAATTACC AT AAAT AATG GCTAGCGA3A AC ATC TTT \ ^ . - '-^^ ~. , i ^ ^ 

TT ACT AC ACT GTTTGCAGAA TGAATGTAGA AATGATCCTG TTAGC7TTCT GAATGTTCTG 1^0 



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416 



(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165: 


CAGGCCTCAG GGCCTCT3GT <3GC T2TGGCC C AG AC ACT AT TTOGAGTTCT TGTGCTATGG 60 

GTGOGAGTCT TCTTCrTCAA GTTTCGGCAG CTGTGCTGTG NCTGGATGGG CTGCTCCTCC 120 

10 CA3GGCTCAA GGGCTGTGGT CG3CTCAGGG TCTCATTTCC CGAGGCCAAG TTC AAGGCA G 180 

CAGCCCTTT j TGAGGCGGTC TTGGCCCTOG GCT3GAGGGA 'GAACTTTAAG CTTTTTTGCT 24 0 

CACAGGGACG TGGTATO3GC CCTGGGT3CA GGT'GCCCACA TTCTGCTAAT GAGAGCTTT3 3 00 

tctgatcagt cgt^ggtcca tcagtttgtc catgtgtccg octoccagcg CGTCCCTTGG 3 60 

GAT2CTTCCC 2T3G3GTGTA GCCTTGTTCA TTAGTATATA CTCATTCCTT CATGCTTTCC 420 

TCAGOAGAAC ACTTCCACTT CTGAGGTGAG CTTTTGCCCC RT^JC ZTTCC TC 2ACAGGT3 4 80 

TTGCCTTTTT ATAAAGACCT GAT AGO AG AA T AAATT-GGT j TTTCC 2TGTT 'GACCCAGCAC 54 0 

GATTTCTGTG > 33CCTAG AAT AT3GCCCTCA ACOGTTAGA 3 TO3Q3CAGT5 AGGGCTTGAG 600 

GAGTGACCCT TCCTTTCTGA TG3TTTTAGT i2ATTTTOGCT orCAGCCCTT AATGGCACAG 6-0 

ATCTGCTGCT TCTAACAGAT GG2CAGGAG3 TGAGACCGAT TTCAG2CATT tGCCAAGGTTA 72 0 

GCACCCTCTC CTTTGAGCCT AGGGCCACAC TGTTCATTGT CACTTTAGGG AAGTGCCTGT 73 0 

TTGGCTTTAA AG3TAAGC-2T GCCAG2TGTG AGAAGCGTTG GTAACTGATG 1 IACTCATTTC 84 0 

CTGGTCCTTA AAGATGCAGC CTCTTAAGGG CTCCTT'GATG GATGCCATCT CTCCTAGCCG 900 

CCAGCCCTGG TGCCACTGGT GQXLA3GTTC CCATTCTTTG (5G3CTGGGAG GGACAGCTTG 960 

CCTGTTTCTG GTCACAAATT ACAGTCTTGT '2TCCTGTACC ATTCTGTGGC TTCAGC ATQ 1 102 0 

GGGCAGTAGC CTTTCATTAG TGTAGATAGT CATTCCCTOJ TAGGGTGGAG GGTAAGACAT 108 0 

AGGGTCTGGA ACTC7TTTGGG ACCTTTTQ3G GATGTC'GTGT GCCTCCCAGA TTCCTMGATT 114 0 

CTGGGAGGAG AGGrTGCCGC ATTCTGCTGC TCCTCACAGC 'GAGCAAAGCT GCACCCACTT 1200 

ACATTCAGTA TTTTCCTGGC ACTACAAAGA 'GTOGGAAGGC CTGGGATTTG CTGCTGCTCC 1260 

CTTAGAGCAG GGCCCCTYTT TTCAGCACTT TGGACACCT j GAG AC C CAGC CCTGTTATTT 1320 

50 AATGGTAGTG GGCAAGTGTG TGTGC ATACT GTCTGCCACT GCTTTCTCCC TGCCCCATGC 1330 

CAGAGAGCCC TGTCCCTGCC AGGCCCAGCC TTCTTAGCCC <IAACTTGGGA ACAAAGTGCA 1440 

^ ACATGOGATC ATGGGTTGGG GTGCTCAGGT GAGCCCTCTC TATAGTGCTT CCCTGGGCCA 1500 

AGCTGACACC AGCCCCTGAG GGTGGGGTGG GACGGGTGGT GCTTAAAAGA GGAAGGGGAC 1560 

CAGTGTAGCA ACTTGCCAGG GACCCCACCC CTCCCTCTCT GGGCC TGTGC AGTGAGCATG 1620 

60 GGGATTCCCA TCAAGGGGCC TGGCACCTGT GCTAGTTACG TAGCCGCTGN TCACGCGCTC 1680 



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417 



ACT02TGACC ACATGC AC 3 T TCCCTAGATG CAGACTGCTT TGAACTTTAA AGCTGTACAA 174 0 

T TTGGTTATG TTTGT3CT3A CTTAAAATAT ATTTTAATGA GGAAAAAATA ATGGAGAACC 1300 


CTG3GAACGA CCTGGTTCTT TTGCTTCTC3 GGGAACTGTA A3CCCTCGCG TTCT33GAAT I86 0 

OyjMTCT^: TGCTCTTTCC TGGAAGCTAA GCCTGTCTCC A2CGCCCGAG GCCTGCGCCG 192 3 

10 GT3CTCC03C CGCAGTTGCG TTT3CTTTG3 ACCITGCGTG GG2,GGGA3GG GGTGCTC 3GT 1980 

CCGA3CCC32 TC 2TTTCTGT ACACCTAGCG CTGCCCGCCC CG2TTGT3TC TGAGGTCGT3 2 04 0 

TAT3TCAAAA ATAAAGCCGC TAGAAACGGA AAAAAAAAAA AAAAAAAAAA AAAA AAA. AAA 2100 


AAACTCGAGG GQ3GGCC 3GT ACC 2AATTAA CCCMNTATGA TCT ATAAA GO GTC 2 IS 3 






(2) INFORMATION FOR SEQ ID NO: 166:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1251 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166: 

CtCCC ACGCGT CCGCCCACGC GTCCGGCGGT GCGGAGTATG GGCCGCTGAT GCrCCATGGAG 60 

-:xx:tactggc GCTTcnr/^: gc-tgctgckxj t2Ggcactgc tcgtcggcit cctgtcggtg 120 

35 atcttcgccc tcgtctgggt cctccactac cgagaggggg ttggctgoga t:;ggagcgca 18 0 

CTAGAGTTTA ACT3GCACCC AGTGCT3ATG GTCACCGCCT TCGTCTTCAT CCA3GGCATC 24 0 

^TATC , AT ( ' , G TCTACAGACT GCCGTGGACC TGGAAATGCA (JCAAGCTCCT GATGAAATCC 300 


ATCCATGCAG GGTTAAAT3C AGTTGCTGCC ATTCTTGCAA TTATCTCTGT GGT3GCCGTG 36 0 

TTTGAGAA.:C ACAATGTTAA CAATATAOrC AATAT 5TACA GTCTGCAGAG CTGGGTTGGA 420 

45 GT AT AO 3TG TGATAT02TA TIT GIT AC AG '-'TrT'.'A.; 32TTTTCAGT CTTT C T 2h2 TT 4^0 

::ALTjGG:rC C32TTTCTCT CC2AGCATrT CTCAT03CCA 2ACATGTTTA TTCT3GAACT S4 0 

3T3ATC1TTG GAACAGTGAT TGCAACA3CA .0TTATG3GAT TGACAGA 3AA A2TGATTTTT 6 00 

TCC3T3AGAG ATCCT3CATA CAGTACATTC CCGCCAGAAG GTGTTTT3GT AAATACGCTT 660 

OGCCTTCT3A TCCTOGTGTT CG3GGCCCTC ATTTTTTGCA TAGTCACCAG A 2 C GO AATGG /2 0 



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ACCATGTAAA ATGTTGTAGA GATAGAGCCA TATAACGTCA CGTTTCAAAA CTAGCTCTAC 06 C 

AGTTTTGCTT CTCCTATTAG CCATATGATA ATTf JGGCTAT GTAGTATCAA TATTTACTTT IC2 0 

5 AATCACAAAG GATGGTTTCT TC-AAATAATT TGTATTGATT GAGGCOTATG A^CTGACCTG 1080 

AATTGGAAAG GATGTGATTA ATATAAATAA TAGCAGATAT AAATTGTGGT TATGTTACCT 114 0 

TTATCTTGTT GA3GACCACA ACATTAGCAC GGTGCCTTGT G*^AKAATAGA TACTCAATAT 12 00 


GTGAATATGT GTCTACTAGT A- .7TTAATTGG AT AAACTGG C ASCATCCCTG A 1251 




(2) INFORMATION FOR SEQ ID NO: 167: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 882 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167:


g ac smtcta- j aactatggtc ccccgggact gc a '3gaattc ggcacagcgg ctgcgggcgc fio 

gaggtgag^g g03cgaggtt ccca3cagga tgc 2ccggct ctgca'sgaag ctgaagtgag 12 0 

30 aggcccggag ag3gcccagc ccgcccgggg cag.5atgacc aag.^cccggc tgttccggct 180 

gtggctg3tg ct3gg3tcgg tgttcatgat cctgctgatc atcgtgtact g3gaca303c 2 40 

aco:g:o>:j cacttctact t^gacacgtc cttctctagj cogia::acgg goccgccgct 300 


0202ACG:2 2 GG>2CG^ACA GOSACAOSGA GCTCACGG2C GAYTCCGATG TIG AO .lAKTT 3 60 

TCT<GGACAAK TTTi 2TC AGTG CT3GCGTGAA GCA-^AGTGAC YTTCCCAGAA A 3G AG ACG 3A 420 

40 GCAGCCGCCT GIGCCGjGGA G2ATSGAGGA GA< GCGTGAG A RGCTACGACT GGTCCCCGCG 480 

CGAMGCCOSG CGCACCCAGA CCAGGGCCGG CAGCARGOSG ANCG3AGGAR CGTGCTGCGG 540 

GGCTTCTGCG OZAAYTCCAG CCTGGCCTTC CCCACCAAGS AGCG03CATT CRACGACATC 60 0 


CCCAA2TOSG AGCTGASCCA GCTGATCGTG '3ACGACCG3C ACG3G3CCAT CTACT OCT AC 660 

GTGCCCAAGG T3GCCTGCAC CAACTGGAAG CG2GTRATGA TCGTGCTGAG CGGAASCTGT 720 

50 GCACOSCGTG C3CCTACCGC GACCCGYTG2 GNTCCCGCGC GAGCACGTGC ACAAOOCCAG 730 

CGC GCACTG A CTT CAACAAT TCTGGCGCCG CTACGGGAAG TCTCCCCCAC CTCATGAAGT 84 0 

CAAGCTCAAG AAT AC AC C AA TTCTTTCTGC GCGACCCTTC TG 832 




(2) INFORMATION FOR SEQ ID NO: 168: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1208 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168:

GOGAAACTCA AAAGGATGAT ggaatggttg atggagccag AGCCTAGAAG TRAAGGGATA 6 0 


CAGAGTGAAG AT AG AO jT AT TT AC "LIT AT AT TTW AAT ATT A G:TTTCGAAT TACGTAGGGA 10 0 

TTCTTAAGAA AAGATCATGA CA3GACAGCC ACATTTGGTA AAATGTCAC3G GCAGCCAGTG 100 

15 CATGGTCCTC CTGGOGCTCC TCAGTTGACG GGTTTAAATC ATTTCCTGAT CCCCCTGCCC 24 (I 

TGGTTTSAGG AAT C >' LT AT AC A GT AC GTGAAA TGC CTGTGGT ATGAGTTGCA ATGGGCAATC- LOO 

AACCTGGGTA AATCCAAGAT TAATGATTAG TT CTAAAG AT CCAGTTGAAG TTCTAGAGTG 'MIO 


GGAATTTTCC GTCAAGCARC TCAGCACAGO TTTATGCCTG TTCCTCTAAT AACGATAGGT 42 0 

AACAAATAiX: TGTGTKTWOA CAGCTAGGAR GATAACCAAA TCTAGAGTTC TTGARTCTCA 480 

25 TTTAATAAAT AAK T ATT ATG AGTACCAACT GCATATTTCA G3CA2TGCAT TTGACTCTGT r >40 

TAAATACTOA Ti'C'LTTTAKGA CMSCCACWTC AGAWAACMTT AATCTGTCTG ATCAATAAAC 60 1 

agcttgactt aga.-;rggt;^a AATAGCTTGC CACAGGTWAC CCAATTAGTA GGTAACAGCG '">0 


ACAGAATAAC AGTGCAGTTA AAATCTTA OA CTGGAG AC T A ATTGCATAAG TTTGAATTTC 7 2 0 

AGTTCTGCTA TGTAAATTTG I3GTGAGTACC TTAATTYACC TCAGTCTCGG TCTTTATATC ISO 

35 TGTAGAATGG AGCTAATGAT ATTACTTAAT TT GCTTTAT G TGAGATTAAA TGTACTAATA ?A0 

TATGTAAA.TC ACTTACAA<:A 'OCATTTGACA TATTTGACAT ACTTAATATA TTTGCTACTA :»0D 

ATACTATTAG C AAO" AGC ATT CTGATTTTCC AAGTTGAAAT TCAGTGTTTT CTTTTTTACT ^60 


TTGC CAT AAT TTACAATGTT GTGCTCTGTA AAC C AT AAAT TTCCCTGAGG TGTTGTCAGG 102 0 

TTAAAAAAAA ATCACTATGG CCCCCARHMA CTTGGAAAAT AGAAATGAGA CCAGCTTCAT 10 0 0 

45 CTATATTCTT TACTGCAAAT AACTTAGAAT TGTAATAGGC T AA T ATGT AC TGGGACTTCC 1.14 0 

AATTTGGGAA TATGAC AAAA AT AAT ACT A' V T TAG." .' T AA.AA CATATACV1A ACTTATTTTT 1^00 



CCTCTGAA 




(D) TOPOLOGY: linear
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15 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169:

GGCACGAGAG AAAA3AGGTT GAGAATGTTT TCTAGCAGGC AGAATGTGCA TACATGTTTT 60

CATGAKTGTC CTTTGGGTGC TGTTTCTTTT AAATCCTCTG TGCACAGGGC TCTGGCCTTT 120

ARTAAACTGT TTTTCTGTCT TACGTCATSC TGACTGGGTG CTAGGGGCTG ATTACAAAGG 180

GGAAGAGTTG AACAGACATC AGGGGCCGAT GAAACCAAAG GACTA3GA:;T CAGGAGAACA 240

AGTCAGGGAT TAGGAGACAG CG3TTTGGTT TATTGTTATC CAGCTGGA-3G ACTCCTAGGG 300

GCAGCAGCAG GAGGAATACC AGGGCCACGG AGGGGCAGGA GTCTCACA3T GGAG3GCAGA 360



CTCTAACAGA T-3CCAGCTGA ACGCTCGCTG GCCCTGGATG TCATACGA7T TGGGGACCAG 42 0 



AAATCT'SGGC TCAGAGAACC CGTCCAGGGA 1 GATTTG AAGC CATGGGTT A T CTTCTAt^AGT 4 SO 

20 TGATACT 3 AT AATATATTTT AATTTTTATT GATGTTTAAT ACCTTCTGAA ACAGGAGGGT 54 0 

AAGATCAGAT GGGAAGCCCY TCTGTTGAAG GAT CTTGGG A ACCTTGGTGG TTTTTTTTTT 600 

^ TTGGTTTTTT TTTTTTTGAT CGAGCTGTGG ACATCCTTCT TAATTCGATT NTGAGGATTT 660 

GTTTAACTAA AAAGTTCC C A AACACAGAAA GGGCCTCCC2 AC "TGCTT73 GGGAGCTGTC 72 0 

TGT SCTGGG A GTGCCAGGCA TCCSATGGGA CCCATCACTG CC AGTGTCTG TSCCTCCCAG 78 0 

30 AGGTCA.5CCC TGTGTCTGCC CTGGCTCTGT '2TCCTCTGTG ACAGGGCAGA GCATTTCTGG 34 0 

TCAGTTTCTC CATGGTGCCT CCCACCCCTT TGTAAAGTGG ATGGACATGA TGGAATTCAG 900 

TTGTCTCACC CTGATAGOCT GGGTGTTGAT ATTCACTTTA CCOGCACTCA GACACAGGCG 96C 

ACCTTGAAGC AGTTCTCG jT GTGTAGAGTC CACGTGACAG TCCCCACAGC CTCCCCAGAT 102 0 

AGCTGTGTGC CTGTGCGCTA CTGCTGTGCC ATTTTCCCAA CTTNGGCGTT TCACTAAATG 1080 



CAGCTGATCT CTCTCTCTGT GCACTCGTGA TCCATGTTGA ACAATACATG TAGGTTCTTT 114 0 

TTCCACGCAA TGTAAGAACA TGATATACTG TACGTTGGAA AGO ATTTAC C TTATTTATAT 1200 

ACCTGAATGT TCCTACTACA CAAATAAACA TATATTAAAT WCTAAAAAAA AAAAAAAAAA 1260 

CTGGAGGGGG GGCCCGGTAC CCAAATCGCC GG AT AGTG AT CGTAAAC 1307 

(2) INFORMATION FOR SEQ ID NO: 170:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1624 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170: 
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10 






GGCACGAGGT CGCCGCCGIG GCC3CCTGGA ATT3TGGGAG TTGTGTCTGC C AGT G ■ 3GC TG 
GGGGAGGCGA AGGTCCCTGA CTATG3CTCC CCA3AG3CTG CCTTCATCTA GGAT03CTCC 12 0 

TCTCGGCATG CTG3TTGG3C TG3TGATG3C CGCCTG3TTC A 3CTTCTGCC TGAGTGATGA 180 
GAACCTGAAG GACTTTG3CC TGArGAACCC AGAGAAGAG3 A3CACCAAAG AAACFGAGAG 24 0 

AAAAGAAACC AAAGCCGAGG AG3AGCTGGA TGCCGAAGTC CTGGAGGTGT TCCACCCGAC 3 00 

GCATGAGTGG CAGGCCCTTC AG-C C AGGG7 A GGCTGTCCCT OGAGGATCCC ACGTACGG3T 3 6,0 

GAATCTTCAG ACTG3G3AAA G AG A G G GAAA ACTCCAATAT GAG3ACAAGT TCCGAAATAA 42 0 

15 TTTG AAA 1 GGC AAAAGG:TG3 ATATCAACAC CAACA2CTAC ACATCTCAG3 ATCTCAAGAG 
TGCACTOGCA AAATTCAAGG AQ3GC-GCAGA GATG3AGAGT TGAAAGGAAG ACAAGGCAA 3 
GG\GGCTGAG GTAAAG3G3C TCTTCCGC3C CATT3AGGAA CTGAAGAAAG AC TTTG ATG A 


GCTG AATGTT (3 TC ATT- 3 AG A CTGACATCCA GATCATG3TA OG3CTGATCA ACAAGTTCAA 
TAGTTCCAGC TCCAGTTTG3 A A 1 3 AG AAG AT TGCTGCGCTC TTTGATCTTG AAT ATT ATG T 720 
25 CCATCAGATG GAG'AAT'GC G3 ACJGA 3CTGCT TTCCTTT3GT GGTCTTCAA3 TGGT3ATCAA 
TGGGCTGAAC AG JAG A GAGC CC 3TC GTGAA GGAGTATGCT GCGTTTGTCG TOUOTGCIKX: 
CTTTTCCAGC AACCCCAAG3 TCCAGGTGGA GGCCATCGAA GGGGGA3CCC TGCAGAAGCT 900 


GCTGGTCATC ctgc-ccacc-g AGCAGCCGCT CACTCCAAAG AAGAAGGTCC TGTTTGCACT 960 
_______ -.^ ^-.v^afTV"^ r-r-Ar^j-^r'Ar; r T*rr'r r rr;AArr* TY^AGGGGGCT 1020 

35 GCAGGTCCTG ACGACCCTCG TC-CAGGAGAA <3GG 3ACGG AG GTGCT03CCG TGCGGGTGGT 1080 
CACACTG3TC TACGAGGTGG TCAG3GAGAA GATGTTCGCC GAQ3AGGAGG CTGAGCTGAC 114 0 

C CAGG AGATG TCCCCAGAGA AGCTGCAGCA GTATG'3CCAG GTA3ACCTCG TGCCAGG3CT 1200 


GTGGGAACAG CGCTGGTGCG AGATCACG3C GCAGGTCCTG GCGCTGCCCG AGGATGAT3C 1260 
CCGTGAGAAG GTOrrGCAGA CA:TG3G3GT CCTCCTGACC AOCTGCCGGG ACCGCTAIJG 1320 
45 TCA3GACCCC CAGCT' ^30 ~A GGA3ACTC3C GAG:GTGGAG G:TGAGTACC AGG-3CTG-3C 13 B0 

Z AG3C T 1 G 3 AG CT 1 3CA- * 3ATG G'lG A 3G A 3 3A GG 3G T AG TTC : G\GGA ; 3CT GG TGG £ TCT ■ j V 14-* 0 

CAACAGCTTG CTGAA3GAGC T ( 3AGATGAGG CCGCAGACCA GGACTGGACT GGGATGCC3C 
TAGTGAGGCT GAGGG3P3CC A3CCTGG3TG GGCTTCTCAG G3AGGAGGAC AT CTTGGC AG 
TGCT'GGC^TG G^CATTAAAT G3AAACCTGA AGC<TCAAAAA AAAAAAAAAA AAAJV^AAAAA 



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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2003 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171: 

10 GGCACGAGCC AGCTT3CAG3 AGGAATCGGT GAGGTCCTGT CCTGAGGCTG STGTCCSQGG 60 

CCGGTG3CTG CCCTCAAGST CCCTTCCCTA GCTGCTGCGG TTGCCATTO^ TTCTTGCCTG 120 

TTCTGGGAT:: AG3CACCT3G ATT3AGTTGC ACAGCTTTGC TTTATCCGG3 CTTGTGTG'GA 180 


GGGCCGGGITT GGGmcr-A T:TG3ACATC CT'GAGG AC A 1 3 AAAAAGCTG3 GTCTTG3TGT 240 

G3CCTCC:AG G3TTAGTGTT 3CCTCCCTCA AAGACTGACA G2CATCGTTC TGCACGGGGC 3 00 

20 TTrCTG3ATG TGA7 3C3AG:: TAA3CATAGT AAGAAGTCCA GCCTAGGAA 3 i^GAAG^ATTT 360 

T-53A03TAjj TGG:TTT03T GACAC A 7rCA CTTCTTTCTC AGCCTCCAG3 ACACTATQGC 420 

CTGTTTTAA3 AGACYTCTTA TTTTTCTAAA GGTGAATTCT CAGATGATAG 3TG AAC CTGA 480 


GTTGCAGATA TAG 3AACTTC TGCTTGTATT TCTTAAATGA CAAAGATTAG CTAG3TAAGA. 54 0 

AACTTCCTAG G3AAITAG3G AACCTATGTG TT C CCTCAG T GTG3TTTCCT GAAG3CAGTG 600 

30 ATATGGG3GT TAG3ATAG3A A'GAACTTTCT COGTAATGAT AAG3AGAATC TCTTGTTTCC 660 

TCCCACCTGT GTTGTAAAGA TAAACTGACG ATATACAG3C ACATTATGTA AACATACACA 720 

G3CAATGAAA 3C3AA3CTTG GCG3CCTGGG CGTGGTCTTG CAAAATG3TT CCAAAGCCA3 780 


CTTAG2CTGT TrTATTCA3C GGCAACCCCA AAG3ACCTGT TAAGACTGCT GACGGCCAAG 34 0 

TO3CATGCA3 ITCC : TATO" CACCGG3ACC TGGTCAGCAC AGATCTTGAT GACTTCCCTT 900 

40 TCTAGGGCAG ACTG3GAGG3 TATCCAGGAA TCGGCCCCTG CCCCACGGGC GTTTT CATG3 960 

TGTACAGTGA CCTAAAGTT3 GTAAGATGTC ATAATGGACC AGTCCATGTG ATTTCAGTAT 1020 

ATACAACTC 2 ACGAGACCGC TCCAACCCAT ATAACACCC3 ACCCCTGTTC GCTTCCTGTA 1080 


TGGTGAT AT C ATAT3T.AA3A TTTACTCCTG TTTCTGCTGA TTGTTTTTTT AATGTTTTGG 114 0 

TTTGTTTTTG ACATCAGCTG TAATCATTCC TGTGCTGTGT TTTTTATTAC CCTTGGTAGG 1200 

50 T ATT AG AG TT G3ACTTTTTT AAAAAAAGGT TTCTGCATCG TGGAAGCATT TGACCCAGAG 1260 

TGGAACGCGT GGCCTATGCA GGTGGATTCC TTCAGGTCTT TCCTTTGGTT CTTTGAGCAT 1320 

CTTTGCTTTC ATTCGTCTCC CGTCTTTGGT TCTCCAGTTC AAATTATTG 2 AAAGTAAAGG 1380 


ATCTTTGAGT AGGTTCGGTC TGAAAGGTGT GGCCTTTATA TTTGATC C AC ACACGTTGGT 1440 

CTTTTAACCG TGCTGAGCAG AAAACAAAAC AGGTTAAGAA GAGCC GGGT 3 GCAGCTGACA 1500 

60 GAGGAAG3CG CTCAAATACC TTCACAATAA ATAGTGGCAA TATATATATA GTTTAAGAAG 1560 
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GCTCTCCATT TGGCATCGTT TAATTTATAT GTTATGTTCT AAGCACAGCT CTCTTCTCCT 162 0 

ATTTTCATCC TGCAAOZAAC TCAAAATATT TAAAATAAAG TTTACATTGT AGTTATTTTC ISS'J 


AAATCTTTG3 TT3ATAAGTA TTAAGAAATA TTGGACTTGC r ?^:CGT.VJT TAAAGCTCTG 174' 

TTGATTTTGT TTCCGTTTGG ATTTTTGGGG GAGGGGAQ3A CTGTGTTTAT GCTGGAATAT 180" i 

10 GAAGTCTGAG AC3TTCCGGT GCTG3GAACA CACAAGAGTT GTT3AAAGTT GAC AA GCAGA 136''.' 

CTGCGCATGT CTGTGATGCT TTG T ATC ATT GTTGAGCAAT CGCTCGGTCC GTGGACAATA 192 0 

AA I AGT ATT A TCAAAGAGAA AAAAAAAAAA AAAAAACTCG NGGG3GQ3CC CGGTACCCAA 19R-) 
TTCGCCCTAT AGTGAGCCNA TTC 






LATr iLTL'i A CO-G^TG.A ^TCCAi x 2A" 
35 AAATTACTTT CTT AGTTTTC TTCATTAAAA ACTAAGAAAA TGCTTTGTTT ATTATGAATT 
GCTATTTCTC TTG ATT ATT A TTCTTGGAGA AAGTCTATGA GACGTAATTC TTCT3ATTTG 
CTTCTAGGCT a<3aggaaaat GTGAAAGATG ACAAATGAAA ATTTCAAAGG TTGTCAGTAG 
TATGACTTCT TTTATCGTTT GTCATTATCA CAAATATATC AACATAGGAC ttttaaaaga 
TATTTTGTAC AT ATTO }(X G TTAGTAGGAT TTTGCATGAA TTTTTTTTTT CTTTFATGC 2 
45 C AG AG AGA AA GAGCAAAGAA ATAACCAAGG GTGATGTACT ! 3' 3 T ATTG AAG GTTTACCAAA 
TAAGGACT GG TTTTATTATG AACTATAGTC TATATTCTAA 1 3TAAATCAAT TTTTC T ATT A 
T3TGTTTTTT GTTCCT3CAG GCAAGATCTC TGAACTTTAT 'GZAGAGGGTT CTTTTAAAAA 
AAG AAAGT TG AATTTTTTTA TTTCTTGGAA TATTTTTTTT CATrGATTTC TCCCAAGTAG 
AGO AG ATTG A AATCTC '~TTT GTAGCGTATG TCTTTTTTGT TTTGCTATTA GCT GAGT ATT 



200 i 



€0 



(2) INFORMATION FOR SEQ ID NO: 172:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 786 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172:

GGCACAGCGG CAQ3AGAAGA CTTTGGTGTT T AAG AG ATT A ATGTGTTAGC CAGAACAACT 

T^TrTAT^nd AATAATTTTG 12 0 

IRQ 

240 

3 00 

3 60 

4.3) 

4---0 

34 ) 

600 

66 3 

72 0 
(2) INFORMATION FOR SEQ ID NO: 173:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1758 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173:

GGGACGAGCC CTGOCCACCT CCTOCAGOCT CCTGCGCCOC (3CCGAGCTGG CGGATG3AGC 60 

TGCGCACGGG GAG3GTGG03 AGOCAGOCGG TGGCGCGGAG GATGGATGGG GACAGCCGAG 120 

15 

ATGGOGG03G CG3CAA3GA3 ■GCCACCG03T C3GA3GACTA CGAGAACCT3 CCGACTAGCG 18' 1 

CCTCCGTGTC CAC:CACAT3 ACAGCAGGAG CGATG02CG 3 '3ATCCTO3AG CACTCGGTCA 240 

20 TGTACCCGGT GGACTC GGTG AAGACACGAA TGCAGAGTTT 3AGTCCAGAT CCC AAA 3CCO 300 

AGT A 1 2ACAA 1 3 TATCTAOGGA GCGCTCAAGA AAATC A IGC G <3AC3GA^3CT TCTGGA3GCO 360 

CTTGCGAGGC GTCAACGTCA TGATC ATGOG TGCAGGGCCR GCC-3ATGCCA TGTATTTTG ; 420 

25 

CTGCT ATG A-\ AACATGAAAA G^'TTTTAAA TGACGTTTT 3 CAC IACOAAG GAAACAG3CA 480 

CCTAGCCAAC GGTATTTTGA AA3GGTTTGT CTGGAGTTAG AAAGTTCTCT TCTTCAACAO 54 0 

30 GTCCCTCCCC AGGGTGTTCC TCCCTGTGAC CCAGCCGCCT CGACTTCGGC CCGCTTGCTC 60'') 

ACGAAT AAA* 5 AACTCAGAGT TGT , 3TGT3CA ATGCACACCC AGA3ACAC3C ACGCACACAC 660 

ACGCGCGCGC ACACACATGC TTTTTTCTGT TCCCCTCCGC TTTCTGAAGC CTGGGGAGAA 72 0 

35 

ATCAGTGACA GAGGTGTTTT ( 3* 3TTTT ATTG TTATGTGGGT TTTCTTTT(3T ATTTTTTTTG 780 

TTTGTTTTGT TTTTAAACAT T '3AAAAGC AA TT AATG AT C A GACATAGGAG AAACCCTGAA 84 0 

40 TAGAAACAAA ACTTTTGAAT '3CTGGATTCA AAAAAAAAAA AAAGTTATCT GGACAGCTTC 900 

TTT< 3 AG ACT A TTTAAAAACT GGTACAACAG GTCTCTACAA CGCCAAGATC TAACTAAGCT 960 

TTAAAAGGTC AAGAAGTTTT AT03CTGACA AAGGACTCGC GCAACGCAGA AGGCCTTTCC 102 0 

45 

CACCTTAAG3 TTCCGGGGAT CT03GAATTT TACCCCCATT CTGTTCTGTT TGTCTGAGTC 108 0 

TCATCTCTCT GCAAGCAAGG GCTGAAATCA TTTTGTTTGG TTGTTTTGAG GGAGAGAGGC 114 0 

50 GGGGTGGGGG GGTGCAAATC TGCCAGCA3C TCTTACGTAA G03ATGTTTT ATTGGGGAGG 1200 

GCTGAGCTTT TATTTTCTCC T CTCCAGTGG (3GTTGGCTTT T ATTGTTT CT TGTTTGGGTT 1260 

TGGAATGGAA ATATGGATAG CAGCATAAAG TACTTTTATT TTGACAAAAT T3ATTTTTTT 1320 

55 

CAACAATGGA GACATAGATT TGACCCACAA T AACTTCT C C CCCTCTCTTT TTACTCTGCT 1380 

CAAAAAGCAT CTCTCCTCCC ATTACCCAAC '3TTGGTCATA AG TGTGCCTG GCTGGTTTGC 1440 

60 AGATATTTGT TCTGCTTTGT AAAAATTGGC CATT AGTGC A TTTATTGAGA TGATCTCTAA 1500 
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50 



AMA.^P. AAAFA AFAAA 3GAG A 



AGAGCTATGC CCTGACCTAC CCCTGATTCT ATGACATTGG G3CC:TTCTT TTGCTGAAAC 15 60 

T'3CCTTACGT A\T03TTTTA CTCCTTGAAA (3AGATTTGAC GGAYTCCATT TTATGCCAAG 162 0 

T 3CTGCC 3TG C AC TG TTT C T G3AATATGTG GT3TATGCT3 TOO T3 AT C TT GCTGGGAATG 1680 

ATT AT AA 3 TG TGTGTGTGGT GG3GGAGTG3 GT ATT AC AT 3 CAVTTOCTC-AA GAGTCAAAAA 17 4 0 

10 AAAAAAAAAA AAACT03A 17 5 B 



(2) INFORMATION FOR SEQ ID NO: 174:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 888 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174:

25 CTGTTAGAAT GCCCAGTTTA CCTGGATGGC AACCCAACAG TGCTCCTGCC CACCTGCCCC 60 

TCAATCCTCC TAGAATTCAG CCCCCAATTG CCCAGTTACC P AT AAAAAC T TGTACACCAG 120 

COTCAGGGAC AGT^TCAAAT GCAAATCCAC AGAGTGASMC ACCACCTCGG GTAGAATTTG 180 

30 

ATGACAACAA T C CCTTT AGT G AAAGTTTT C AAGAACGGGA ACGTAAGGAA CGTTTACGAG 24 0 

AA~AGCAAGA GAGACAA^GG ATOCAA'~TC 'A TGCAGGAGGT AG AT AG AC AA AG AGC TTTGC 3 00 

35 AGC AG AG 3 AT '3GAAATG3AG CAGCAT'IXSTA TGGTGGGCTC TG A< 3ATAAGT AGTAGTAGGA 3 60 

CATCTGTGTC CCA.ATTCCC TTCTACAGTT CCGACTTACC TTGTG ATTTT ATGCAACCTC 420 

TA3GACCCCT T<2A3CA<3TCT CCACAACACC AA7AGCAAAT GGG3CAGGTT TTACAGCAGC 4 80 

AGAATATAC A ACAAGGATCA ATTAATTCAC CCTCCACCCA AACTTTCATG CAG ACT AATG 5 40 

AGCGA3GCAG GTA3GOCCTC 3TTCATTTGT TCCTGATTCA CCATCAATCC CTGTTGGAAG 603 

45 0 C C AAATTTT TCTTCTGTGA AGrCAGGCA 2 A TGCAAATCTT TCTGGCACCA OCTTCCAGCA 66 3 

3TCCCCAGTG AGG2CTTCTT TTACACCT'3C TTT AC C AGC A GCACCTCCAG TAGCTAATAG 72: 
CAGTCTCCCA TGT :3G> 2 CAA G ATTCTACTAT AACCCATGGA CACAGTTATC CGGGATCAAC 
3CAATCGCTC ATTCAGTTGT ATTCTGATAT AATCCCAGAG GAAAAAGGGN AAAAAAAARA 



730 
840 



'A G AATTCCAC C AAGGCTCC 33 8 



60 
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(A) LENGTH: 7.379 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175:

GGCAGAGCTA GTGT3GACTC CATCC7CCTG GAGTGGGATC ACGN 7TATGA CCTCAGTCGG 6 0 

10 GACCTGGAGT CT3CAATOTC 7AGA3CTCTG CC7T7T7AG3 ATGAAGAAGG TCAG3AT3A7 12 0 

AAAGATTTCT ACCTCCGGGG AG2TGTTGS 7 TTAT2AGGGG ACCACAGT3C CCTAGAGTCA 13 0 

CAGATCC7AC AA7T3GGCAA A G3CTG3ATG ATAGCOTCTT T 7AGATAG AG 7AAA 7 ~GAAA 2 10 

15 

AT AT G ATT 7G OAGCAAAAOT CC7ACG3GG7 CGGA 77TAGA CAC7AGCTAC AAAG37TACA 3 00 

TGA.\^:iS7T 3O0CGAATO7 AGTAGCAGTA TAGA7TC7GT G AA' TAG ACT 3 GAGCACAAAC 3 60 

20 TG AAO 7 A 1 3 3 A AGAG7AGAGC CTTCCT33CT TT7TTAA77T O 7 ATAGT A 7G GAAACCCAAA 4.10 

03G7T3GT3T 7 ATT 3 AC C G A TGG3AG7TT7 TC7AG3C77A G7CATTGAG3 AA' 73 A 7TTGA 46 0 

■ 7 3 AT 7 AA 1 3 7 A GAACCTCCAG AA3TGGCAG7 A 3TTTAA7TC AGACTTGAA7 AGCATCT3<33 54 0 

25 

OCTOTCTGGG GGA7ACG3AG GAGGAGTTG3 AACAG3T7 7A G-7GTCTGGAA CT7AGCA7TG 60 0 

ACATCCAGA3 OATC3AG7TC 7AGATCAAAA AGCTCAAGGA 37T7CA3AAA G2TGT3GACC 66 0 

30 ACCGO\AAG7 7ATCATCCTC TC:ATGAAT: T7TG7AG77C TGAGTTCA7 7 <2AGGCT3ACA 77 0 

GCAAGGAGAG CCG33ACCTG 7A3GATCGCT T3T3G7AGAT GAATGQ33 77 T303ACOGAG 7 0 

TGTGCTCTCT GCTG jAG3A< j TGGGGGGGC 7 T37T3CAG3A TX:72T'3AT3 CAGTGCCAG3 840 

GTTT7CATGA AATGA3CCAT 03TTTGCTT7 TTATG7T33A GAACATTGAC AGAAQ3AAAA 90 0 

AT3AAATTGT GCCTATT ( 3AT TCTAA7CTTG ATGCAGAGAT ACTTCA<3GA 7 2ATCACAAAC 96 0 

AGCTTATGCA AATAAAG7AT GAG7T7TTGG AAT7CCAA7T CAGAGTAGCC TCTTTGCAAG 107 0 

ACAT3TCTTG C CAACT ACTG GTGAATGCTG AAGGAACAGA CTGTTTAGAA G7CAAAGAAA 10S0 

AAGTGCATGT TATTGGAAAT 0307T7AAAC TTCTCTTGAA GGAGGTCAGT CGTCATATCA 114 0 

AG3AACTGGA GAAGTTATTA GACGTGTCAA GTAGTCAOTA G7ATTTGTCT T7CT3GTCTT 120 0 

CT3CTGATGA ACTGGACAGC T7AGGGTCT7 T3AGTCCCAY ATCAGGAA* >7 AGCAGCCGAA 126 0 

ACAGACAGAA AACGCCACGA GGCAAGT3TA GTCTCTCACA GCCTGGACGG TCTGTCAGCA 132 0 

GTGCACATAG CAG7TCCACA AAAG3TG3CT C7GATTCCTG CCTTTCTGAG CCARGGCCAG 13 8 0 

GTCGGTCCGG CCGCGGCTTC CTGTTCAGAG TCCTCCGAGC AGCTCTTCCC CTTCAGCTTC 144 0 

TCCTCXTTCCT GGT'TATCGGG CTTGCCTGCC TTGTACCAAT GTCAGAGGAA GACTACAGCT 150 0 

GTGCOGTCTC CAACAACTTT GCCD3GTCAT TCCACCCCAT GCTCAGATAC AC G AATGGC C 1560 

60 CTCCTCCACT CTGAACTAAG CAGATGCCAT CTGCAGAAGT GCTGGTAGCA TAAGGAGGAT 162 0 






CGGGTCATAA 

GTGTGGCAGC 

AG AT AAAC AG 

AGACTCTGAA 

TGAATTCT<3G 

CAGCTTTCCC 

GTGTCGTACA 

ATTTTTAA3A 

TTTAACCTTT 

T AAAG AAAAA 

TGTTATAGTA 

TGTATTGATT 

T CGAGGGGGG 



CCAATCCCAA A C T AC C AAC ' A AGAGGACCTT GATCTTGGCG AAAGCCMTCG 163 0 

tttag2ctcc t 2cagatcac atgtgtgcaa al v r atggc tt cagaggtgga 174 0 

tgacggggoa ajaaacagac aacaagaagg tttggaagaa atctggtttc iboo 

CCTTAGCACT AAGGA 3 ATTG AGTAAGGACC TCCAAAGTTC CC C GG ACTCA 186 0 

GCCCTT3C>::C IJATTCTGTGC ACAGCCAAGG ACTTCAGTAG ACCATCTGGG 192 0 

ATGGTGCT:*: TCCAACCATC AGATAAATGA CCCTCOCAAG CACCATGTCA 19S0 

ATCTACCAAC CAACCAGTGC TGAAGAGATT TTAGAACCTT GT AAC AT AC A 2040 

GCTTATATG3 CAGCTTCCTT TTTAGCTTGT TTTC 2TTTOG GOCATGATGT 2100 

GCTTTAGAA3 CACAAGCTGT AAAT 2T AAAA GGCA2TTTTT TTTAGAGGTA 2160 

CTAGA1GTAA TAAATAAGAT CATGGAAGGC TTTATGTGAA AAAA3TTGAA 2 22 0 

AAAW-AAAG ATATTTAT 3-T ATGTACAGTT TGCTAAAGCC AAG TTTTGTT 2 2 80 

TCTTTGCATT T AIT AT AG AT ATT AT AAAAT AAAAAAAAAA AAAAAAAAAC 2 34 0 

GCCCGGTACC CAATTCGCCC TATAGTGAG 2 370 



30 
(2) INFORMATION FOR SEQ ID NO: 176:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1343 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176:

CCGOCTTCAC GAT^JCGGCG GTCAGTGGTC CAGGTCCCTT ATTCT'GC C TT CTCCTCCTGC 

tcctggajcc cca2A>cc:t gagacogjgt gtcztcctct acgcaggttt GAGTACAAGC 

TCAGCTTCAA AGG 2C :AAGG CC^CA'TCCC CTG7^CTOC AATACCCTOC TOOAOCCATC 

atgga«3gtga gg:-gcagg3G tcsggaecgg tatscccagg gtccctgaaa gi'-ctggagg 

CtGCTGTRACT TG3TCOGGAG TGCGTCTGTC ACACGCATCC TCTGTCCAGG GTGGGGCAAG 
GCCIGGGACA GTCOCA^GCA CCCCAG3ACC CCTTCCAG3C TTGTCTCCTG CTCCACCGCC 



60 



24 0 
3 00 
360 
42 0 
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C TOGOGGCTA C CTG3 AGGG A A- 3 3 AT 7 2TC A 
3TATOTGTGT TGTG33TGGA AG~AGGCATG 
5 GGG ACTT 2C A AACCCAGACC TA 2AAAGAGT 

3GTTTG T 3 AT GATOGAACTA CAC 3ATAGAG 
G3A3C 3TGAA ATAGTCTAOG A' 3 TAT 333 TT 

10 

rGAGAGTT' 33 GGG AT 3GATT rAATT 3 TG AG 

■ -ag at- : 33ct gto 3atttc t gt 3tataaaa 
15 cg:ctgtaat c3cg3 3gctt tg33ao3cca 
3 aaac 3 at' :g t 3 3 3c 3g aa t gg t 3am 3g gg 

G3TGTG3TGG CGTG3GCCTG TOjTC 3CAGC 

20 

TGGGCCT03G A33CC3AGGT TG 333TGAGC 

G AC AG A 3CCA OACTCTGGCT GAAA_A-\AAAA 

25 CAATTCGCCG NAT AT 3ATCG TAAACAAT 






T7CCAGGT3A GTOGG3ACCA GCCCTTCCCT 660 

A 3 A3C ATC T T AGCC-7ATAGG T3 CGTATTCA 72 ; > 

GTGTCTTCTA C 3AGATC TT 3 TTC AAAJAAAG 78 ^ 

G 3 A 3TGAGCA AG AA : AA.T3 A G3 ATT AG AGT 8 4 ' i 

C G AAA AC A T A T3CT3TGAGG TCTGTCCACC 90 'J 

CCTCTTAGCA GGCAAAG2AA A3 A. 3 AG AAA 3 96'.) 

T3TGAGTTCT T03CCG3GTG CQ3T3GCT3A 102 0 

Q3G3G3ATG3 GTOG3GAGGT CAGGAGGTTG 108' ' 

TGA3TCTACT AGAAGTG2AA A3ATTG3CTO 1140 

TTCTCGGGAG G3TGAG3CG3 GAGAGTTG3T 1200 

TGA 3ATCCTG C GATTG3AC I 1 T3AG3CT03G 1 26 

AAAAAAAAAA A3T3GAGGGG G3CCCGTACC 13 2? 1 



1343 



(2) INFORMATION FOR SEQ ID NO: 177:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1502 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177:



40 CTCAAAATAA ATAAATAAAT AAAAATTT G T ATTCCATTGA TTTG03TAGA CACCAGGAAT 6 0 

GTGCATTTCT AACAA03TTT CCAGGCGATC CTATA3TAAG TCATCTGTG3 ACTAOTTTAA 120 

G AAACT CTT C TATAGAGAAT G3AGTTG3AT TAATAATAG3 TGATTTTTTA CACTG3ACTG 180 

45 

ATTCACAAGA ACCTAAACAG TAGTCCATGA AGC TO 3T CAT CTGTG3T.AAC TATTT 1 3GC CC 240 

CGTCTCACTC T 3AAAGCAGC A 3GAGATGTT GTTTACTTTG TTTCTATCCC CTTT3TCTGG 3 00 

50 AGATTAATTT T<3GAATGAAA GrTTTTCTCT CTATX3CCATT CCTGGTTCTT TTCCAAAGCC 3 60 

T CAT AC AAG A G3ATTAGGTC A 3Al.AT3C.ATG CATTACCTTT TAAAAGAAT3 ' 2 G A T ATTG AT 4 20 

ACCGATGCTT ACTTTTTTTT TTTTTT IAC T A CTTGTTTTAT TCCTTCCAGN AAAGT AT AGC 4 80 

55 

CCGCCTTTCT AT AGC AT AGT TCTCTTTAGG TGGAATGATT CCTATAAGAT TTCT7ATTAT 54 0 

TAAATCATGC ATTTTTCAAG AT03AATCAA TMTTTGATTT AATCTAA3CT GATATTCTCA 600 

60 TTTGTTAGAA GAACAACCTA CATGCTAGAG AG AG A 3GA03 AAATATACCC ACGACCACAC 66 0 
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AGCCAGTTAG TATCCAGTTG GTCCTGGACT CCAGCCAGGT GTCCTGCCTC ATGGTAGTTA 72 0 

AATGATATAT AGAAAAGGTA AATTTTTAAA GAAATATTTA TTAATATATT CCTATAAAAC 7 80 

ATTTTAAAGG TAACCACATA AAAATCGTTA ATTTTTCCAT TCCAAAGTAA ATGCTAAGCA 34(1 

TGTTTATTAA TG AA'GC AGT A C r T T ' "J TG ATT A CTATATGACA TTCTGAACTT AATTAAACTC 



GA 



90 0 



10 ATTGCACTAA ATGTGTCTTC CTT3GTATAG TGGAGGATTT GAGGATTGGA ATATAGAGTA 960 

G AGTGC TTGC TTAAGCCTGG GAGCCCATCT TTATAGCTAT TTG AT GT AAG AAAAGAGACA 102 0 

TGGNC C ATTT CTAAACTATA TAAGOTGAGT GT3TCTATTC CCAGCAGATA TAAAGG/\AAA 1080 

AG3AAACTTT TTTGATTCCC ACCTTCCCAG 3CTCAC3TAG CCATCTTCCA GGCTCAAATA 114 0 

TAGAGATGTT AGTGCAAG3T CCTGGGGT TT AGGTCATCAT TTG AT AAG T G G TTT AG AG AT 1200 

AAAGAAAAAG TAGTGTTTGT AT ^TTT TTTT I-IMuIAAlC CCAAAACAAA TTT AT ATTGT 12 0 0 

ATTCAGCAAA ATTGGAATT2 AG:.1GTTTAA TTTTAGAACA TGAAGTGC2T GCTGTTTTAA 1320 

GCATTGACTT GTATAAAAAG AATTGGATGT CTCCAGTAAG CTTATG3GTT TTCTCATTTT 13 8(3 

TAGGTATATG GCTTTTAATC AT3TAAAGTG AAACATTAGT TTTCTTGCAT TTTATTACAG 14 4 0 

GTTGTTTGTT GCAAT AAAG A TGCTGCTGAA ATTAATTGAA AAAAAAAAAA AAAAAAAG TG 150 0 



1502 



(2) INFORMATION FOR SEQ ID NO: 178:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1637 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178:
45 ATTT f O 3 T AGG CCA IAAGGAC TGAAOTTOAG ATC C AAAAGT T<7ACTOCTA ATTATCTTGA -0 

O 3GT ATCAAG T CAAG AT AT A AAGACTG3AG CATG<30AGCC CTGACATCCC ATCTACAAAA 1P0 

50 

GGAAAGTAAC AATTCAAACT GGAACCTCAG GACCCGAAGC AAGTGCAAAA AGG ATGTGTT 2 0 

T ATGCCGC C A AGT AGT A 3TT CAGAOTTOCA GG AG A 3C AG A G3ACTCT2TA ACTTTACTTC 3'j0 



60 
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10 



20 



AM) 

T AAAGO AG AT GCT<3AAAGTG AACG ' TG TT GG AC AAAAAA GT CAGCTTGATA GAACTGTCTO 54 0 

C ATTTCTG AT GOT ^GAO -AT GTGGTGAGAC C CT GAG TG I 1 G ACCAGTGAAG AAAV2A3CCT 600 

T 3T AAAAAAA AW3AAAGAT GATTGAGTTC AGGATCAAAT TTTTGTTCTG AACAAAAAAC 660 

TTCTGGOATC ATAAACA^AT TTTGTTCAGC CAAAGACTCA GAACACAACG AGAAGTATGA 720 

G3ATACCTTT TTAGAATGTG AAGAAATCGG AACAAAAGTA GAAGTTGT3G AAAGGAAAGA 7 80 

ACATTTOCAT ACT3ACATTT TAAAACGTGG CTCTGAAATG GACAACAA3T GCTCACGAAC B40 

CAGGAAAGAC TTCACTGAAG ATACCATCCC AO ^GAACACA GATAGAAAGA AGCAAAV2AA 900 

C-2CTGTATTT TTC ■ 2 AO C AAA TATAACAAAG AA ( ACTCTTAG CCCCCCACGA CGTAAAG2CT 960 

TTAAGAAATG GACACCT2CT CGGTCACCTT TT.AATCTCGT TCAAGAAACA CTTTTTCATG 102 0 

ATCCATO3AA GCTTCTCAT2 G 2 TACT AT AT TTOTCAATCG GACCTCAG3C AAAATGGCAA 10 80 

TACCTGTGCT TT< 3- 3 A\< ;TTT 3T3GAGAAGT A.TCCTTCA3C TGAG3TAGCA AG AAC C O 2 AG 114 0 

ACTGGA>3AGA T7F 3TCA3AA CTTCTT.AAAC CTCTTGGTCT CTACGATCTT C3GGCAAAAA 1200 

25 C2ATTGTCAA GTTCTCAGAT GAATACCTGA CAAAGCAGTG GAA 3TATC2A ATTGAGCTTC 1260 

ATGGGATT-3G TG2ACC2rGA AG AC 2 AC AAA TTAAATAAAT ATCATGACTG G 2 TTTOGG AA 13 2 0 

AATCATGAAA AATTAAGTCT A T2 T I A AAC T CT 3CAGCTTT CAA3CTCATC T3TTATGCAT 13 30 

30 

AGCTTTGCAC TT2AAAAAAG CTTAATTAAG TACAACCAAC CACCTTTCCA C-C C AT AGAG A 144 0 

TTTTAATTAG CCCAACTAGA AGCCTAGTGT GTGTGCTTTC TT AATG T3 TG TGCCAATGGT 150 0 

35 GG AT C TTT GC TA2 TGAATGT GTTT3AACAT GTTTTG AG A T TTTTTTAAAA TAAATTATTA 156 0 

T TTGACAA 2 A ATCCAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 162 0 

AAAAAAAAAA AAAAAAA 163^ 




(2) INFORMATION FOR SEQ ID NO: 179:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2911 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179:

GGTGGTTTTT GTTCT02AAT A03CGGCTTA GAGGGAGGGG CTTTTTCGCC TATACCTACT 60 

GTAGCTTCTC CACGTATGGA CCCTAAAGGC TACTGCTGCT AC T AC GGG02 TAGACAGTTA 12 0 

CTG TCTCAGC TCTAiCGATGT (2CGTTCTTCC ACTAGAAGCT GTTCTGAGG 3 AG3TAATTAA 180 

60 AAAACAGTGG AATGGAAAAA CAGTGCTGTA GTCATCCTGT AATATGCTCC TTCTCAACAA 24 0 






300 
360 



TCTATACATT CCTGCTAGGT GC CAT ATT C A TTGCTTTAAG CTCAAGTCGG AT CTT AC TAG 

TGAAGTATTC TGCCAATGAA GAAAACAAGT ATGATTATCT TCGAACTACT GTGAATGTGT 

5 „ 

GCTCAGAACT GGTGAAGCTA GTTTTCTGTG TGCTTGTGTC ATrCTGTGTT ATAAAG^G 4.l> 

ATC ATC AAAG T AG AAATTT G AAATATGCTT CCTGGAAGGA ATTCTCTGAT TTCATGAAGT 480 

10 GGTCCATTCC TG-JCTTTCTT TATTTCCTG3 ATAACTTGAT TGT CTTCT AT GTCCTGTCCT 

ATCTTCAACC AG:: GAT'GGCT GTTATCTTCT C AAATTTT AG CATTATAACA ACAGCTCTTC 

T ATTC A< 3G AT AGT3CTGAAG ANGCGTCTAA ACT 3G AT CCA GTGGGCTTCC CTCCTGACTT 

15 

r rATTTTTGTC TATTGTCGCC TTGACTGCGG GGACTAAAAC TTTACAGCAC AACTTGGCAG 
GACGT03ATT TCATCACGAT GCCTTTTTCA GCC GTTCCAA TTCCTGCCTT CTTTTCAGAA 
20 ATGAGTGTCG CAGAAAAGAC AATTGTACAG CAAAGGAATG GACTTTTCCT GAAGCTAAAT 
C^GAAOaCCAC AGGCAGAGTT TTCACTCACA TCCGTC TTGG CATGGGCCAT GTTGTTATTA 
TAGTCCAGTG TTTTATTTCT TCAATGGCTA ATATCTATAA TGAAAAGATA CTGAAGGAAG 
GGAACCAGCT CACTGAARGC ATCTTCATAC AGAACAGCAA ACTCTATTTC TTT3GCATTC 1C2 0 

TGTTTAATGG <3CTGACTCTG GGCCTTCAGA GGAGTAAC CG T 3 ATC AGATT AAGAACTGTG 
30 GATTTTTTTA TGGCCACAGT c;CATTTTCAG TAGCCCTTAT TTTTGTAACT GCATTCCAGG 
GCCTTTCAGT GGCTTTCATT CTGAAGTTCC TGGATAACAT GTTCCATGTC TTGATGGCCC 
AGGTTAOGAC TGT^GATTATC ACAACAGTGT CTGTCCTGGT ctttgacttc aggoggtccc 
TGGAATTTTT C TT 1 GG AAGC C OGATCAGTCC TTCTCTCTAT ATTTATTTAT AATGGCAGCA 132 0 



540 
600 
66 0 
720 
780 
84 0 
900 
960 



1080 
1140 
1200 
1260 



1380 



AGCCTCAAGT TCOGGAATAC GGACCTAGGC AAGAAAGGAT CCGAGATCTA ACTGGGAATC 
40 TTTGOGAGCG TTC CAGTGGG GATCGAGAAG AAC TAG AAAG ACTTACCAAA CCCAAGAGTG 144 0 

ATGA3TCAGA TGAAGATACT TTCTAACTGG TACCCACATA GTTTGCAGCT CTCTTGAACC 150 0 

TTATTTTCAC t .TTTTC AGTG TTTGTAATAT TTATCTTTTC ACTTTGATAA AC C AG AAATG 
'GTTCTAAATC CTAATATTCT TTOGATATAT CTAGCTACTC CCTAAATGGT TCCATCCAAG 
■3CTTAGAGTA CCCAAAGOGT AAGAAATTCT AAAGAACTGA TACAGGAGTA ACAATATGAA 
50 GAATTCATTA ATATCTCAGT ACTTGATAAA TCAGAAAGTT ATATGTGCAG ATTATTTTCC 
TTGGCCTTCA AGCTT i GC AAA AAACTTGTAA TAATCATGTT AGC TAT AGCT TGT A r G AT AC A 



1560 
162 0 
1680 
1740 

1800 



GCAGTTTCTC AGACAGAACA TCTCAGAATT TTAATTTTTA GAAATTCATG G3AAATTGGA 2 ICO 

TTTTTGTAAT AATCTTTT 3 A TGTTTTAAAC ATTGCTTCCC TAGTCACCAT AGTTACCACT 216 0 

5 

T 3T ATTTT AA GTCATTTAAA CAAGC 3ACG3 1T3G03CTTTT TTCTCCTCAG TTTGAGGAGA 2 220 

AAAATCTTG A TGTCATTAOT CCTGAATTAT TACATTTTGG AG AAT AAG AG QG 2ATTTTAT 2 2 80 

10 T IT ATT AG TT ACTAATTCAA G3TGT 3 ACT A TTGTATATCT TT C C AAGAGT TGAAATGCTG 2 34 0 

GCTTCAGAAT CAT AG 2 AG AT TGT2AGT3AA GCT3ATGCCT AGG AA 2 TTTT AAAGG3ATCC 2 400 

TTTOAAAAG3 AT~A3TTAGC AAACACATGT T3ACTTTTAA CTGAT3TAT3 AAT ATT AAT A 2 460 

15 

CTGTAAAAAT AGAAAGACCA GTAATATATA AGTCACTTTA CAGTGCTACT TCACACTTAA 2 52 0 

AAGTGCATGG TATTTTTCAT GGT ATTTT GC ATGCAGCCAG TTAACTCTC3 TAGATAGAGA 2 580 

20 AGICA3GTGA TAG ATG AT AT TAAAAATTAG CAAACAAAAG TGACTTGCTC AGGGTCATGC 2 64 0 

ACCTG3GT3A TGATAGAAGA GTGGGCTTTA ACTGGCAGGC CTGTATGTTT ACAGACTACC 2700 

ATAC T3TAAA TATGA3CTTT ATGGTGTCAT TCTCAGAAA 3 TTATACATTT CTGCTCTCCT 2760 

25 

TTCTCCTAAG TTTCAT3CAG ATG AAT AT AA GGT AAT AT A 3 TATTATATAA TTCATTTGT3 2 820 

ATATCCACAA TAATATGACT GGCAAGAATT GC-TGGAAATT TGTAATTAAA ATAATTATTA 2 880 

30 aacctaaaaa aaaaaaaaaa aaaaactcga g 2 911 



(2) INFORMATION FOR SEQ ID NO: 180:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 519 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180:

45 GGCACGAGCC CCA(3GCCAGC C AG 3GO 2AGG CCTACTTT3G CCACCCTTAA ATTAGAAT3T 60 

GGGGTCAG03 GTCACAGAAA AGCGATTTCT CTGACCTA3T GTTTGGCGTC CGGGAACT3T 12 0 

GT3CCCAAC^ TT^AGACGCT '3GOAGTCCTC ACT<3AGGCCA TTGGCCCAGA GCCCGCCATC 180 

50 

CCC03ARACC CCC3GGAGCC '3CCT3TTGOC ACGTCCACAC CTGCCACACC CTCTGCCGGG 240 

CC3CAGCCC3 TCOrAACCGG GA3CGT3CTG GTCCCTGG-3G GTCCTGCCCC ACCTTGCCTT 300 

55 G3GGAGGOAT G3G2CCTCCT CCTCCCACC3 TGCCGGCCGT CAGTCACCTC TTGCTTCTGG 360 
TCCCCCAG3C CTAG3CCTTG GAAG 3 AGAC A GGAGTCTAG3 GAGGCTGAAG CCCACTCCCG " 420" 

G3GAGGCCCG T3CTCCTCCA ' 3C C CCAGGG A CAGCAAGGAA AAG A' 3 AAG AG AGC A 3AGC AT 480 

60 
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TTCATG3CTC TAATAAAAAA AAAAAAAAAA AAA AC T CG A 



(2) INFORMATION FOR SEQ ID NO: 181:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 968 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181:
TCC7CTTGGG GCC 3GAAAAA GO GGGOTTGG CCTGNGCATT GGTTNTCCAT GO0GCCCGCC 
CATGGCCCAG TACTAOCOTG CAGTCCCOAAT GTACCCCCTC C ZTCYTCCMA GA3CCCYTCM 
20 aaccg:c2cg stcakttgtg ATTTCAGGAG GAT'ITGATGA AGATGTTAAA UCGAAAGTGG 
AGAA CCTTCT C G ^GATTTCC ACOOTGOVYA AAACGGACCC TGTTAGGCAA GCA2CCTGCA 

GC ^ nT .^ CCTG to^octtctt ccccTCcccT tcy::ccgccc gtggagacag ctgttytcag 

25 

CAC--GGCTCT 2 CG2AC;GGAGG GGGCCGGCTC CTTCCCTGGC AGCAACATCC TTGCCCTTGT 

CACACAAGT2 AG2CTCCATC TCXTGCAGCTC TGTGGATGCG CTGCTGGAGG GCAACAGGTA 
30 tgtcactogc tggttcagcc CCTACCACCG CCAGCGGAAG CTCATCCACC OGGTCATGGT 
TCAGCACATC CA:;CCCGCAG 0:3CTCAGCCT C2TCOGACAG TGGAGCACCC TCGTGCAGGA 

qittggacxxtt gccctgcagc t<^gctttcta ccc:x3ATCcc gtggaggagt aGCTGGAGGA 

AAACGTC ~tC AC CCCAGCCTGC AGCCXXTTGCA ARCTCTGOTG CJ\GGACC:TCA (iGGAGGTGTC 660 

toccccccog ctgccaccca ccagccctgg cago2acgtt .^ctcaggacc cctgagggga 
40 GA' 3CTCATGC CA:;GGGGCTC CTIX3TGGAGG CTOGGGGC^ tctgovyteoy -vwwggcct 
oggcaatacg gg7oacgtgg gcgtcgtgcc ctctggccca -.xvagtgtctt g7cgacactc 

, r . . r ,v - ^ - , ™- — p/^^ r \ 2 0 TGGG Z AGA G At 7T A GAAAA SAC AG AA< GG AAGC AG 

45 

CACACOGAOA CCCCCTTTGT GATC TOMATO TCTGACACTG ATTCTTPGGA AATAAAGAGT 
GGAA 3CTG 

50 

(2) INFORMATION FOR SEQ ID NO: 182:



35 



60 
120 

240 
300 
360 
42 0 
4^0 
540 
600 



720 
780 
840 

9 0 0 
-60 
^63 



(D) TOPOLOGY: linear

60 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182:
TGTAAAAGTT ATCAGTAATC CTAATTCTTT TCCTQ5GTTT TCCTTTTGTC ACTTATTAAT 
CAGTTTTTCA AAGGACGAAT GAATTTAGAG ATGTACTCTG G A 5CAGTATC ATGTTAAACC 
AGGGGTATAT T AG AAAAATC ATCCTCATAA T C ACT 1 3 TOG 3 AAGTTTTTCC TCCCCAAAAA 
AA3CCATCCT GATOOSTTTT CAAAACCAGA AAAWOCTCT TAYTGAGGAA CAGACCACTG 
' -A ST A" • 'AT GAGCATCTCA GGAAAACTGA GACCCTCGAG AACCCTTGAT TTCGT'XAAG ' 
CCCCAAGOTT TCA3AGCCAG CAGCCCAGTG CTOTOGTTGA CAGACGTGGT TTTKT^RGA 
AAGCAGJCAG AGGCCAGGAA TTTTCAGAGT CGTGA 3TCAC GRTYTCCCAC CCAAGATTAG 

AGCAMAG\TT ac-ccatactg agatttggta aaatcattct gt z t aa< g s aa tggaggtgtg 

TOG AMAC CTG CAGTOCCTGT TCACAGGGGA TGC AG GGA 1 3 A TCSYOG3TTT AGGAT3GGGR 
AOGOCACCGC ACCCCCYTTC AYT-GCTCTGC ACCTGCTCCC TCACGTOGAC ACTGT SCACA 
ACT3TOGCTC T CA 1 2A G Z A 1 A GTT'SCCCAAG GAGCTCATAT C7TATTOGAG ATAGQG3GTC 
GTACAGGTGA CATTCATGAG GAG TGT GA'GC CGGGTGACAT GGG3GTGTCA ACOCAGCATC 
TGTCCAGG AG CTCCTSCTGC AGCGGCTCTG GC AGGTGGCC TGA-SGCTCCT TTT' I?G AG AG A 
GAACTG'ICTG GGGTTGGTGT STGCTCTGCT CTGATCTGTT CTTTCTTGGA ACAGCACCCA 
AGAAGC T GAG CTCCTC 2ATC AGATTGTGA3 CTCCTGGACG GCAGGAGCTG TGTCCTTCTA 
TTGATCTTCC TATGCC ' AGA ACCTTGCAGA GATCCTGGAA TOTOGTAGGT GST 2AGTAAA 
TGTGTG'ITGA ATAAATGAAT GAATGAATGA AC AAATG AAT GAATTTGCTT ACTTCAAGGC 
AAAAGAACCA TG/VGvCTGTA TTTTGAGTTT CTATGTTATA GC AG! 1 C AGC A AAT C C T ATT A 
AATACTTTGT GTTTCCAAGC AAAAAAAAAA AAAAAAAAAA AAACTCGA 

(2) INFORMATION FOR SEQ ID NO: 183:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2276 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183:
CCGCGGCGTC TGACCTCATG GCGTAGAGCC TAGCAACAGC GCAGGCTCCC AOCCGAGTCC 
GTTAT03CCG CTGCCGTCCC GAAGAGGATG A GGG3GCC AG CACAAGCGAA ACTGCTGCCC 
GGGTCGGCCA TCCAAGCCCT TGTGGGGTTG '3CGC3GCCGC TGGTCTTOGC GCTCCTGCTT 
GTGTCCGCCG CTCTATCCAG TGTTGT ATC A CGGACTGATT CACCGAGSCC AACCGTACTC 
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35 



AA Z TC AC AT A TTTCTACCCC AAATGTGAAT GCTTTAACAC ATG AAAAC CA AACCAAACCT 
TCTATTT2CC AAATCAGCAC CACCCTCCCT CCCACGACGA GTACCAAGAA AAGTGGAGGA 
GC-VXITGTGG TCCCTCATCC CTCGCCT^CT CCTCT3TCTC AAGAGGAA03 TGATAACAAT 
GAAGATCCTA GTATAGA5GA GGAGGATCTT CTCATGCTGA ACAGTTCTCC ATCCACAGCC 
10 AAAGACACTC TAG AC AA TOG CGATTATO 3A GAACCAGACT ATGACTG3A2 CACCGGCCCC 
ACK3GACGACG ACGAGTCTGA TGACACCTTG GAAGAAAACA GGGGTTACAr GG AAA TTG A A 
CAGTCAGTGA AAT C TTTT AA GATGCCATCC TC AAAT AT AG AAGAGGAA3A CAGCCATTTC 

15 ttttttcatc ttattaittt TQ:mrx attgctgttg tttacattac atatcacaac 

AAAAGGAAGA TTTTTCTTCT GGTTCAAAGC AGGAAATGGC GTGATGGCCT TTGTTCCAAA 

20 acagtggaat accatcgcct agatcagaat gttaatgagg CAATGCCTTC tttgaagatt 

ACCAATGATT ATATTTTTTA AAKACTGTG ATTTGAATTT GCTTATGTA-. TTTTATTT3C 
TTGACTTTTT ATATGATATT GTGCAAATGT TTGCCATAGG CAATTGGTAC TTAAATGA3A 
GGTGACTCTC TCTTTT' jCCT T XJTGCTTTG GAAATTAAAT GTCACAAACG AG TAT AT AAT 
TTTTTATCTG T AC TTTT AG A GCTGAGTTTA ATCAGGTGTC CAAAATGTGA GTTAAACATT 
30 AC C TT AT ATT TACACTGTTA GTTTTTATTG TTTTA3ATTT ATT ATG 2TTC TTCTGGAAGT 
ATTAGTGATG CTACTTTTAA AAGATCCCAA AC TTG T AAC T AAATTC T j AC ATATCTGTTA 
CTGCTGACTC ACATTCATTC TCG3CCATTC .AAAT A C T ATT TTTTATCCAC ATTTTTTTTT 
GTTCCCAAAC TGTAATGTAC AAGGATATGT GTGATAATGC TTTGGATTTG AGTAATATTT 
TTTTTTCTTC CAAGAAAACT GCTTTG 3 ATA TTTTTAGA T A AT TT AAAC AT AATITAGGAT 
40 AATGATATTG CTCAATCTGA C C AC AA TTTT AGGTAAAACA TTAAATGTGT CAGAAATCTT 
C^AACAGAG ACTC'IXXZAGC TT GCAG A CAT AG AT AAA ATGTTACAGA GAT ACT A TTT 

;AA , rrGC ttttagttag ^aattcattg tagcatgggt tcctccaagg tttoaagcma 1o2 0 

TGGGCACAGT TTAAAATTAT ATCAGATTCG TTTACTTCGT TTATTATTTT ACAGTAAATT 

50 TGAACAAATC TTAGGGGTCA TTATCA'3TTA AAT AAT AC TG TACCTAG3TC TTTCAAATT A 
AAATTATACC TGAATGAAGT TGTTTGTATA CATAAAGGAT ATT ?G T GT AC AA'ITACCTTT 



45 



300 

360 

420 

480 

540 

600 

660 

720 

730 
840 
900 
960 

1020 

1080 

1140 

1200 

1260 
1320 
1330 
1440 
1500 
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10 



GCCGTCCATC CTGTCTCTTG GGCGG ACAGT GTACTTTCCT AATAGGGAAG GGAAGCACAA 2100 

TGGAAATACC CCTGAACCGT TTTATTGCAG TAATTTTTTT CATATCTGAA AC T ATT ATTT 2160 

AATATTTTGA AT AAG ATTTT AAAAAATAAA TGGO\AAG AT ATAAATGTAA AAAAAAAAAA 2 22 0 

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAA 2 27 6 

(2) INFORMATION FOR SEQ ID NO: 184:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2500 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184:
TCCAAGCTAC 02CACTCGGG CTGGGGCGTT GGGAGCGGGA GTGCAGAGCG TOGTCGTGGC 60 
25 GO 2 GGC OjTG AGAAGAQ2GA GO:GKAGGAG GGGGTGCCAT i33CCGOGCAG CAGTTCCAGT 12 0 

ACGATGACAG TGGGAACACC TTCTTCT A< 2 T TCCTCACCTC CTTCGTGGGG CTCATCGTGA 130 
TCCCG3C3A2 ATACTACCTC TGGCCCCGAG ATCAGAATGC CGAGCAAATT CGATTAAAGA 240 
ATATCAGAAA A G T AT ATGG A AGGTGTATGT GGTACGTTTA C GGTT ATT AA AACCCCAGCC 
AAATATTATT CCTACAGTAA AGAAAATAGT TCTQCTTGCA GG A TGGGC AT TGTTCTTATT 
35 CCTTGCATAT AAAGTTTCCA AAACAG AC CG AGAATAC2AA GAATACAATC CTTATGAAGT 
AIT AAATTTG GATCCTGGAG CCAQAGTAGC AGAAATTAAA AAACAATATC GTTTGCTGTC 
ACTTAAATAT CATCCAGATA AAGGAGGTGA TGAGGTTATG TT C AT G AGGA TAGCAAAAGC 
TTATGCTGCT TTAACGGATG AAGAGTCCCG GAAAAATTQ3 GAAGAATTTG GAAATCCAGA 



30 



40 



45 TTC AATTCTG GTTTTACTTG TATATGGATT GGCATTTATG GTTATCCTTC CAGTTGTTGT 



50 



60 



300 
360 
42 0 
480 
540 
600 



TGGGCCTCAA GCCACAAGCT TTGGAATTGC CCTGCCAGCT TGGATAGTTG AC C AG AAAAA 660 



720 



GGGCTCTTGG TGGTATCGCT CAATACGCTA TAGTGGAGAC CAG ATTCT AA TACGSACAAC 7 80 

ACAGATTTAT ACATACTTTG TTTATAAAAC CC GAAAT ATG GATATGAAAC GTCTTATCAT 84 0 

GGTTTTGGST GGAGCTTCTG AATTTGATCC TCAGTATAAT AAAGATGCCA CAAGC AGACC 900 

AACGGATAAT ATTCTAATAC CACAGCTAAT CAGAGAAATT GGCAGC ATT A ATTTAAAGAA 96 0 

55 GAATGAGCCT CCACTTACCT GCCCATATAG CCTGAAGGCC AGAGTTCTTT TACTGTCTCA 102 0 

TCTTGCTAGA ATGAAAATTC CTGAGACCCT T 3 AAG AAG AT (2AQ2AATTCA TGCT AAAAAA " 108 0 



3TGTCCTGCC CTACTTCAAG AAATGGTTAA TGTAATCTGC - 2AA 2T AAT AG TAATGGCCCG 1140 



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30 



40 



GATTAAAGCA TTGCAAGTTK GGAAGTTCAT 7,AGGCTGAAG CCTGTGCCA3 AAAATCACCC 
ACAGTGG3AT ACAGCAATAG AGGGGGATGA AGArCAGGAG G A I AGTGAGG GCTTTGAAGA 
TAGCTTTGAG GGAGGAAGAG GGAGGGACGA AGGAAGGTGG TGGA-TTAAG GCAGTTACTC 
TOGAAT3GWV GCCACAGTGT TTTC'CACCAT ATTT7GGCAA TTTTTTTTCC CCCTrmC 

45 GAAGTGTTTT CCT3TNAANCC C AGO AAC CAT TACAGAACCG 



1380 




GAACCGTGAA G AAA' GGGAG T TTCGTG2TCC AACTTTGGCA TC C C TAG AAA ACTGCATGAA 12 00 

GCTTT7TCAG ATGGCCGTTC AGGGAC TTC A GGAATTTAAG TCTCCCCTTC TGCAGCTCCC 1260 
T C AT A T TG AA GAGGACAATC TTAGAGGGGT TTCTAATCAT AAGAAGTATA AAATTAAAAC 
T AT CC AGG .AT TTGGTGAGT T TAAAAGAATC AGATCGTCAC ACT CT ACT GC ACTTCCTTGA 
AGATGAAAAA TAT GAAG AGG TTATOGCTGT CCTT3GGA3T TTTCCATATG TGACCATGGA 1440 
T ATAAAAT( 'A CAGGTGTTAG ATGATGAACA TAGCAACAAC ATCACAGTAG GATCCTTAGT 15C0 
TACAGTGTTG GTTAAGTTCA CAAGG- 2 AAAC AATGGCTOAA GTATTTGAAA AGGAGCAGT2 15€-0 
15 CATCTCTGCT GCAGAGGAAC AGCCA'GCAGA AG AT 3GGC AG GG TG AAAC T A ACAAGAACAG 
GACAAAAGGA G^TOGCAAC AG^GAGTAA AGG A 2 CCAAG AAAACTG2TA AATCAAAAAA 
AAAGAAA2CT TTAAAAAAAA AA2CTA2ACC TGTG2TATTA CCACAGTCAA AGO AAC AG AA 

20 

AC AAAAG 2 AG G 2 AAATGG AG TCGTTGGGAA 1GAAGCTGCA GTAAA3GAAG ATGAAGAAGA 18-'0 

AGTTTCA2AT AAGGGCAGTG ATTCTGAAGA AG AAG AAAC C AAT AG AG ATT CCCAAAGTGA 18'f.O 

25 GAAA 1 GATG AT G G'T AGTG AC A GAGACTCTGA TAGAGAGCAA GATGAAAAAC AAAACAAAGA 192 0 

tcatgaagca CAGTCGCAAG AATT AC AAC A AAGCATACAG CGAAAAGAGA GAGCTCTATT 19 30 

GGAAAC CAAA TCAAAAATAA C AC AT C CT GT GTATAGCCTT TACTTTCCTG AGG AAAAA C A 2040 

AGAATGGTGG TGGGTTTACA TTGCACATAG ?,AAGGAG2AG ACATTAATAT CCATGCCATA 2100 
TCATGTGTGT A.CGCTG A A AG ATA2AGAGGA ■ 2GTAGAG2TG AAGTTTCCTG CACCAGG2AA 



1620 
1 6 'r. 0 
174 0 



2160 



35 GCCTGGAAAT TATCAGTATA (2TGT3TTTCT GAGATCAGAC TCCTATATGG GTTTGGATCA 222 0 



2 2 HO 
2 3 4 0 
2400 
2 4<; 0 



(2) INFORMATION FOR SEQ ID NO: 185:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1337 base pairs



60 • - vri r - 




TCTCCCTGGC GTTTGGTCAC CTCTGCTTCA TTCTCCACCG CGCCTATGGT CCCTCTTGGA 12 0 

GCCAf;CGTGG C3GGCCTGGC GGCTCCCCXGG TGGTGAGAGA GCGGTCC3GG AACGATGAAG 18 l '; 

5 

GCCTOGCAGT GCTGCTGCTG TCTCAGCOAC CTCTTQ3CTT CCGTCCTCCT CCTtjCTGTTG 24"' 

CTGCCTGAAC TAAQOCtGGYC CCT< 5GMA3TC C TGC T ' XI AG G CAGCCGAGGC CGCGOCAGGT 30 1 

10 CTTG iGCCTC CTGACCCTAG ACCACGCACA TTACC3CCGC TGCCACCGGG CCCTACCCCT 3 6'!) 

GCCCAGCAGC CGGGCCGTGG TCTGGCTGAA GCTGCSGGGG CG:GGGGCTC CGA3GGAGGC 420 

AATGOOAGCA ACCCTGTC^C C^GGTTTGAG ACGGACGATC AC GGAGGGAA GG3CG0GGAA 4 3) 

15 

GGCTGGGTGG GTGOIGGCCT TGCT^TGAGC GCC.AACCGTG 30 OACAAGCC CATGAOCCAG 54 0 

CGGGCCCTGA CCGT'jTT'jAT GSTGGT'GAGO GGCGC3GTGC TGGTGTACTT CGT'GGTCAGG 600 

20 ACGGTCAGGA TGA3AAGAAG AAA 0CG AAA 3 AGTAGGAGAT ATGGAG TTTT GGACACTAAC 66 J 

ATAGAAAATA TGGAATTGAC ACCTTTAGAA C AG GATGAT G AGGATGATGA CAACAGGTTG 72 0 

TTT'^ATGCOA ATCATGCTCG AAG AT AAGAA TGT'3CCTTTT GATGAAA.GAA CTTTATCTTT 780 

25 

GTAGAATGAA GAGTGGAATT TCTATGTTTA AG 3 AAT AAG A AGGGAG TATA TCAATGTTGG 84 0 

GGGC-GTATTT AAGTTACATA TATTTTAACA AC 2TTTAATT TGCTGT1GCA ATAAATACCG 9C0 

30 TATCCTTTTA TTATATCTTT AT A TGT AT AG AAGTACTCTR TTAATGGGCT CAGAGATGTT 960 

GGGGA.TAAAG TATACTGTAA T AA.TTT AT CT GTTTGAAAAT T AC T AT AAAA CGGTGTTTTC 102 0 

TGATCGGTTT TTGTTT C CT G CTT AC CAT AT GATTGTAAAT TGTTTTATGT ATT AAT CAGT 10S0 

35 

TAATGCTAAT TATTTTTGCT GATGTCATAT GTTAAAGAGG TATAAATTCC AACAACCAAC 1140 

TGGTGTGTAA AAAT AATTT A AAATTTCCTT TACTGAAAGG TATTTCCCAT TTTTGTCJGGG 1200 

40 AAAAGAAGCC AAATTTATTA CTTTGTGTTG GGGTTTTTAA AATATTAAGA AATGT CTAAG 1260 

TTATTGTTTG CAAAACAATA AATATGATTT TAAATTCTCT TAAAAAAAAA AAAAAAAACC 132 0 

CCGGGGGGGG GCCCGGN 1337 

45 



50 



(2) INFORMATION FOR SEQ ID NO: 186:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 941 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186:

GGCACGAGCC TGGACGCAGC AGCCACCGCC GCGTCCCTCT CTCCACGAGG CTGCCG3CTT 60 

60 






AGGACCCCCA GCTCCGACAT GTCGCCCTCT GGT -GC CTCtT GTCTTCTCAC CATCGTTGGC 
CTG ATTCT ~C CCACCAGAGG ACAGACGTTG AAAGATACCA CGTCCAGTTC TTCAGCAGAC 



CAGCCCACCT CTC^CCCC AACCTGGCCT GCT'GATGAA 



ACCCAGCAAC TG3AAGGAAC GGATOOGCCT CTAGTGACAG ATC GAG AG AC ACACAAGAGC 
ACCAAAGCA3 CTCATCCCAC TGATGACAC - ACGACGCTCT CIGAGAGACC ATCCCCAAGC 
ACAGACGTC- AGACA GACCC CCAGACCCT~ AAGCCATCT3 3TTTTCATGA GGATGACCCC 
15 TTCTTCTATG ATGAACACAC CCTCCGGAAA CGGGOGCTCT TCGTCGCAGC TGTG31GTTC 
ATCACAOSCA TCATCATCCT CACCAGTGGC AAGTGCAT.SC AGCTGTCCCG GTTATGCCGG 
AATOTPXA GGTGAGTCCA TCAGAAACAG GAGCTO.ACAA CCYGCTGGGC ACCCGAAGAC 
CAAGCCCrCT GCCAGCTCAC CGTGCCCAGC CTCCT3CATC CCCTOGAAGA GCCTGGCCAG 
AGAGGGAAGA CACAGATGAT GAAGCTGGAG CCAGGGCTGC CGGTCCGAGT CTCCTACCTC 
CCCCAACCCT GCCCGCCCCT GAAGGCTACC TGGCGC=TTC GGGGCTGTCC CTCAAGTTAT 
CTCCTCTGYT AAGACAAA-1A GTAAAGCACT GT3GTCTTTG CAAAAAAAAA AAAAAAAAAA 
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAACTCG A 



(2) INFORMATION FOR SEQ ID NO: 187:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 654 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187:

GAATTCGGCA CGA 5G^3CT TGTGCTTTAA A3GAGGTGTT CAAAGCATGT CTGA^AGAG 
45 „ „.„ 

AcrmoB: t:tgttttaa ttaatacttt aaaataattc atattta^v, .«*i^*ak» 

TTCCATAAAG AGGAGGATGT TTAAATGCCT CCAGACTACA TTCCTTTTTA TTSCTTGATT 
50 TTACCTCGGA GTCCAAAGTT CAATTCCCAT AAAC3CAAGCG TTTTATTTGT CACTTTCAAT 



130 



TCAACTATCA TGGACATTCA GGTCCCGACA CGAGCCCCAG ATGCAGTCTA CACAGAACTC 240 



A CACCACAACC CCAGACCCAG 3 00 



3 60 

420 

4B0 

540 

600 

660 

7 2 0 

780 

840 

941 



130 
240 



AT AC ATC C G A TTGCCATOCT TAAGATQ2AA TATGGGC 



GGAAATAGG7 TAACCCACAG 3 00 



60 ta -taaaaai acaaaaatcvv a-c a-xato gt- v --• 1 - ■ ■— i — 






GGAQ3CTGAG GCAGGAGAAT CGTTTGAATC TGGGAGTTGG AG3TTGTCAG TGAGCTGAGA 600 
TCGCGCCACA GCACTCCAGC CT A3GTGAC A GGGTGAGAC T CTGTCTCAAA MA3A 654 



(2) INFORMATION FOR SEQ ID NO: 188:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1848 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188:

GAAACTGGA^ CO jAGAACCG GAGCGAAGCG AAGCGGAAGO CCGGAATGAG GCCGGACTGG 6 0 

20 

AAAGC03GAG C3GGGCCAGG CGG02CTCCC CAAAAGCCTG CCCCTTCATC CCAGCGGAAA 12 0 

CCGCCG3CCC G3CCGAGCGC Q3CG3CCGCT GCGATT3CAG TCGCGGCGGC GGAG3AAGAG ISO 

25 AGACGGCTCC G3CAGCGGAA CC3CCTGAGG CT<3GAGGAGG ACAAACCGGC CGTGGAGCGG 24 0 

T<3CTTG3A3G AGC r IX3GTCTT 2G3CGACGTC GAGAACGACG AC-GACGCGTT GOTGOCXGOGT ^0 0 

CTGCGAGGCC CCAGGGTTCA AGAACATGAA GACTCGGGT3 ACTCAGAAGT GGAGAATGAA 36 0 

30 

OOAAAAGGTA ATTTTC CAC C TOAAAAGAAG C C AGTTTGG 3 TC-G ATG AAG A AG AT 3 AAG AT 42 0 

GAGGAAATGG TTGACATGAT GAACAATCGG TTTCGGAAGG ATATGATGAA AAAT3CTAGT 4 80 

35 3AAAGTAAAC IT T CG AAAG A C AA C C TT AAA AAGAGACTTA AAGAAGAATT CCAACATGCC S4 0 

ATGCX5AGGAG TACCTGCCTG GGCAGAGACT ACTAAGCGGA AAACAT CTTC AGATGATGAA 6 00 

AGTGAAGAG3 AIX3AAGATGA TTTGTTGCAA AG3ACTGGGA ATTTCATATC CACA.TCAACT 660 

40 

TCTCTTCCAA GAGGCATCTT GAAGATGAAG AACTGCCAO-C ATGCGAATGC TGAACGTCCT 72 0 

A CTGTTGC TC GGATCTCCAT CTGTGCAGTT CCATCCCGGT GC ACAG ATTG TGATGGTTGC 7 80 

45 TGX3GATTAGA T AATGC TGT A TCACTATTTC A 3GTTGATGG GAAAACAAAT CCTAAAATTC 84 0 

AGA 1 3CATCT A TTTGGAAAGG TTTCCAATCT TTAAGGCTTG TTTTAGTGCT AATG03GAAG 900 

AAGTTTTA3C CAC GAGT AC C CACAGCAAGG TTCTTTAT3T CT ATG AC ATG CTGGCTGGAA 960 

50 

AGTTAATTCC TGTGCATCAA GTGAGAGGTT TGAAAGAGAA GATAGTGAGG AGCTTTGAAG 1020 

TCTCCCCAGA T3GGTCCTTC TTGCTCATAA ATG3CATTGC T3GATATTTG CATTTGCTAG 108 0 

55 C AATGAAG AC 1 2AAAG AACT 3 ATTGGAAGCA TGAAAATTAA TOG AAGGG TT GCAGCATCCA 1140 
CATTCTCTTC AGATAGTAAG AAAGTATACG CCTCTTCGGG GGATGGAGAA GTTTATGTTT - -1200 

GGGATGTGAA CTCAAGGAAG TGCCTTAACA GATTTGTTGA T3AAGGCAGT TTATATGGAT 1260 

60 






1320 
1330 



TAAGCATT 3C CACATCTAGG AATO3ACAGT ATGTTGCTTG TGGTTCTAAT TGTGGAGTGG 
TAAATATATA CAATCAAGAT TCTTGTCTCC AAGAAACAAA CCCAAAGCCA ATAAAAGCTA 
T AATG AAC TT GGTTACA33T GTTACTTCTC TGACCTTCAA TCCTACTACA GAAATCTTG-3 1440 

CAATTGCTTC AGAAAAAATG AAAGAAGCAG TCAGATTGGT TCATCTTCCT TCCTGTmC^ 
TATTTTCAAA CTTCCCAGTC ATTAAAAATA AGAATATTTC TCATCTTCAT ACCATCGATT 

rrrcTo:GAG aagtogatac tttgccttog ggaatgaaaa gggcaaggcc ctgatgtata 
ggtto:acca ttactcagac ^ctaaagag actatttgaa gtccagttga gtcacaagag 

AAGCCTGTCT TGATATATCA TCTCAGAAAC TTTCCTGAAT atgtgataat ATATCGAAAA 

tgatttatag atccagctgt gcttaagagc cagtaatctc ttaataaaca tgtggcaggt 

TTTGTTTGAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAACTCGA 



1500 
1560 
1620 
1680 
1740 
1800 
1848 



(2) INFORMATION FOR SEQ ID NO: 189:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1146 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189:

AAAAAAAACC C AGGG 3 AACN TTGGGQG2CG CTTTNNIITTC CCCCTCCAGG CCATTOG3GA 

ATTC TTC AAG TTAATCCT3C riTGCT™ GCCAACAGGG CTTGTAGGGG GGAGAGACCC 

AGGATCATCA AGGGGTTCGA GTGCAAGCCT CACTCCCAGC CCTGGCAGGC AGCCCTGTTC 

GAGAAGACGC GGCTACTCTG TC<3GGCGACG CTCATCGCCC CCAGA.TGGCT CCTGACAGCA 

O2C0ACTCCC TCAAGCCCCG CTACATAGTT CACCTCGGGC AGCACAACCT CCAGAAGGAG 

„ „,^, r , r^TiTTTT rrrhcrrcc^ CTTCAAC'AAC 360 
GAG3GOTGTG AGCAGACCCG GACaG-.,Ov-I C^Ti.Cr.L^ ^- 

AGCCTCCCCA AGAAAGACCA CCG.lAATK.iAC ATCATKTCG TCAAGATGOr ATCGCCAGTC 

TCCATCACCT OGGCTGIX^CG AGO 2CTCACC CTCTCCTCAC GCTGTGTCAC 1KXTOGCACC 

AGC TGYCTC A TTTCCGGGTG .GO^GKACG TCCAGCCCCC AGTTACGCCT GCCTCAOACC 540 

TTGSGATGCG < " C AAC AT 2 A 0 CATC ATT j AG CACCAGAAGT GT3AGAACGC CTACCCCGGC 600 



35 



40 



45 



50 



120 
180 
240 
300 



■i „ j 



4 80 



660 



60 






CCCTCCATTT CCACTTGGTO TTTGGTTCCT GTTCACTCTG TTAATAAGAA ACCCTAAGCC 900 

AAGACCCTCT ACGAACATTC TTTGGGCCTC CTGGACTACA GG A'GATGCTG TCACTTAATA 960 

5 

ATCAACCTGG GGTTCGAAAT CAGTGAGACC TGGATTCAAA TTCTOCCTTG AAAT ATTG TG 102 0 

ACTCTGGGAA TGACAACAC2 TGG TTTGTTC TCTGTTGTAT CCCCAGCCCC AAAGACAGCT 1030 

10 CCTGGCCATA T ATCAAGG TT TCAATAAATA TTTGCTAAAT GAAAAARAAA AAAAAAAAAA 1140 



15 



ACTCGA 



(2) INFORMATION FOR SEQ ID NO: 190: 



1146 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 906 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190:

ACTCCCTCAC CCAGGTCCCA GCCCTGGGAA CCACCTACCG TGAGCCCTTT TGC AG AT AT A 60 

GACTCATTTC ATCCTCAGAT GGTCCTTCAA GGTAGGTACT TTAGTCCCAT TTTAGAGATG 12 0 

30 

AGACGATTGA GGCCAGAGGG GTGNNGTAAC TTGCCTGGGG GCTCACGAGC ACAAAAGGAG 180 

CCGAGGCAGG ATCTGACCCT TGTTCTCTGG CCTCACTGCC CTCACTTTGC CATGACrCGA 240 

35 AGTTATGTCC CT AC A AA'GC A ATCaTATGGTC CAAGGYTCTT TTTATTGTAT TTTTATTTTT 300 

AAGGGTCCTG TTCAAAACTG GTGTGAGCTC TGAGGAGTCC TGAACCCTGG GTGCAGCATC 3 60 

CTAGCATC2T GGGAGTCCTT TTCTGCCCAC ACTGAGCTGG GCTCCTCGAG GGGTGGGGCT 420 

40 

GCTGTCCCTG GAAGCCTGGC AGC AGCACTG TATCGGGTTG GCTGAAGCTG ARCGCCGTGG 4 80 

GGTGCAGGGC TCCMGGAATC CCCGTTTGGC TGAAGGGGTT CCCTGTAGCC MGGGATGTTT 540 

45 ATG AGGTC TC TCTGATGCSC CAGGCGCAGG ACATGTGTGC GGGTGGAGAA AAGCAGGCCC 600 

TTTCAGTGCC AGCTCCACTC AATTTCTATG TGGACCAAGA ACGATAAACT TAAAAAATTT 660 

TTTTTCCTAA GGTATCTTCA GAATATGGTG TATTTTTATG TGGAAAAGAA AAGTTATGAA 720 

50 

GGCAGCTGTT ACTTTAAGAG AAAATTCATT AAAAGTCCTC GAGGTATGAA GATGACGGCG 780 

TGCTTCTCAA TGATTTTGGC ATAACTTGAT TGTGGCTGTA ATTTTTTTTT TTTTTTTTGT 840 

55 CAAGCATGTC AGACAATAAA GTCTTTGTAA AAAGRGAAAA AAAAAAAAAA AAAAAAAAAA 900 
ACTCGA 

60 



. 906 






(2) INFORMATION FOR SEQ ID NO: 191:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1941 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 191:

CTTCAGCTGA AGCC 7AGC3GA CCCCTTTTCC ACCCTGGGCC CCAATGCCGT CCTTTCCCCG 

CAGAGACT3G TCTT3GAAAC CCTCA2CAAA CTCAGCATCC AGGACAACAA TGTGGA3CTG 

ATTCTGGCCA CACCCCCCTT CAGCCGCCTG GAGAAGTTGT ATAGCACTAT GGTGCGCTTC 

ctoagtgacc gaaagaaccc g:;tgtgccgg agatggctgt ggtactgctg gccaacctog 



15 



25 



35 



TC3TG02CTT CCTAGAGGAC A3CCTTGCCG CCA-3ACAGTT CCAGCAGAGC CA3GCCAGCC 
TC3TCCACAT G2AGAACCCA CCCTTTGAGC CAAYTAGTGT .^GACATGATG CG.3C=3GGCTC; 
CCCGCGCGCT GCTTGCCTTG CCCAAGGTG:; ACGAGAACCA CTCAGAGTTT ACTCTGTAGG 
aatcacgggt gttgsacatc tcggtatca2 CGTTGATGAA CTCAKTGGTT TCACAAGTCA 
30 TTTGTGATGT ACT3TTTTTG NATTVJGCCAG TCATGACAGC CGTGGGACAC CTCCCCCCCC 
CGTGTGTGTG TGCGTGTGTG GAGAACTTAG AAACTGACT2 TTGCCCTTTA TTTATOIA.AA 
ACCACCTCAG AATCCAGTTT ACGCTGTGCT GTCCAGCTTC TCCCTTGGGA AAAAGTCTCT 

c:tgtttctc tctcctcctt ccacctcccc tgcctccatc acctcacgcc tttctgttcc 

TTGTCCTCAC CTTACTCCCC TCACGACCCT ACCCCACCCT CTTTGAAAAG ACAAAGCTCT 
40 G-Z CT AC AT AG AAGACTTTTT TTATTTTAAC CAAAGTTACT GTTGTTTA3A GTGAGTTTGG 
CGAAAAAAAA T AAAAT AAAA ATGGCTTTCC CAGT 3 CTTC-C ATC AAC GG3 A T3CCACATTT 

CATAACTGTT TTTAATGGTA AAAAAAAAAA AAAAAAATAC AAAAAAAAAT TCTGAAGGAC 
AAAAAAGGTG ACL3CTGAAC TGTGTGT3GT TT ATTC 7ITGT AC ATTC AC AA CCTTGCAGGA 
GCCAA3AAGT TCGCAGTTGT GAACAGAC 2C TGTTCACTGG AGAGGGCTGT GCAGTAGAGT 
50 GTAGACCCTT TCATGT AC TG T A 3 TGT AC AC CTGATACT GT AAACAT AC TG TAATAATAAT 
GCCTCACATG GAAACAGAAA ACGCTGGCTC AGCAGOAA^ TGTAGTTTTT AAAAATGTTT 



45 



6 0 
12 0 
180 
240 



™ w „^„-,^ mrr-.-TN.-T^ a nAA^rAHT ATCGGCAACC 300 



3 60 

420 

480 

54 0 
600 
660 
7 20 
7 30 
84 0 
900 
960 

102 0 

i : b o 

1140 
1200 
1360 



- \ y.--^ ~"\.~: *\ .\ 1 7 2 0 



60 GA2GTKGTAC AT'O .'A 'ATA C '""'OTTOGATC 






CTTTATAGTA TGACGAGTTA ACAAGTTGGT 'GACCTGCACA A^GCGAGACA CAGCTATTTA 1560 
ATCTCTTGCC CAGATATCGC CCCTCTTGGT GGGATGCTGT ACA3GTCTCT GTAAAAAGTC 162 0 

5 

CTTGCTGTCT CAG2AGCCAA TCAAOTTATA GTTTATTTTT TT 2TGGGTTT TTGTTTTGTT 1680 
TTGTTTTCTT TCTAATCGA3 GTGTGAAAAA GTTCTAGGTT C\3TTGAAGT TC T'GA r rGAAG 1740 
10 AAAC AC AATT GAGATTTTTT CAGTGATAAA ATCTGCATAT TT GTATTTC A ACAATGTAGC 1800 
T AAAAC TTG A TGTAAATTCC TCCTTTTTTT CCTTTTTT03 CTT AATGAAT ATCATTTATT 
C AG T ATG AAA TCTTTATACT ATATGTTCCA CGTGTTAAGA ATAAATGTAC ATTAAATCTT 



15 



20 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2118 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



30 



40 



45 GCAGTTGACG AGACAAGT C A GACCCAAAAA ACGACGCCAA GGTAGTGAGT G3GTGCCTAT 



50 



186;) 

192 0 



GGTAAGACTT TAAAAAAAAA A 1941 



(2) INFORMATION FOR SEQ ID NO: 192: 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192:
AAATAATAAT AAHAATAAAT AAAAATWAAG TGCTTAKTGT AACTCAGCGG ACAGGGCTCC 60 
CAGCTGCTCT GGCACGTG2G ACACCYTCCA CCCTGCACAC AAC AGGCATG CAAAGAGGAC 12 0 

35 TGX3ATATGGT G3GGTAGAGT GCTTCTGGTG TGTTCACTTT AAGAAAACAT CTGCCAAGAG 
A< GAAGAGTGC CCAGGAAAGA CCA'3GAAAAT ACAAGTACAT G3CTGCTTCA T A C CAT AT AC 
CCCAATTCTT T AAAGC AG 2 A AAAGGCACTT TTTTTTTCAG G2 2AGAGTGA ATCTAAAACA 3 00 



180 
240 



360 



AACCTGGCTT TGCTTACAGG GAAGCTGTCC CAGAAGGACT GAGTGATGCC TCTTGTTCCC 
TAAGGTCTGG AG AGTCTTT G CAAGTTTCCA ACGACATTTC CAACCAGGTG G3AGAGACCA 420 



430 



TTGGGAGTAG GATGATTTGA GGAAAACAGG AAGAAAAACC GGTCAGAAAG TGGCACTTTG 540 



600 



GAAGTGGAAA GCTGTTTGCA AATAGCAACT CTGGCTAAAG CGAAAATGTT AAT CAAGT AG 

AAAGTAAAAT TCAGGATCTT AGAAGCTCAT CCTTCTGATG AGAACTATTT TTTTTTCCGT 660 

GAAGGAACTA TTATTACTTT AAAAGTGAGG GTAATTTACA TATGGGGTGT ATATATTCTA 72 0 

55 AAAATAGTAA T AAAAGT AC C TTTTATAAGC AATGTTGTGT GGCTTGTAGA AGAAAGCAGG 780 
GAGGAAAAAA AGGCAGGCAA AACTAGTCTA GGTCTAGGCC CTAAAAATGA GCTTCCTTCC - - 840 

CACTTGACTG GAAACGCCCA TGTGATTTCT AGGCTGAAAA TAGGTAGGAT TTAACGAGTA 900 

60 






15 



960 
1020 



ACCTAGTTCC GTTCTGTCTC TGATTTCTGA TCAGCTGATG GAGCTGCTAG TAAGAGGGGC 
CGATCATGCT CCCAGACGAG TCCTTP3GCC TCTTSCTCTC CATCCCAAGC CTGACTCCTT 
C AGO AGCAGC CCZCTCCTTC TGTGTCCATG TGATGCAGGC AAG^GC AGT AAG AGGG 1030 
CATCCCATGT TCCAGTTCAC CTTCTATGGG GTCACTAEGA GGTTC CCGGT AACTAGGGCA 1140 

aqcmgitoc aaaagcagct gcaagcttca gaaacccact tc.tccaaca 120c 

CCAOGGAC^ GGCAGAGAGC C<*TCCAAAA GCCCACT3GG AGAOGCATAA GATTCTGTGC 126C 

CAG3CCCCCA GCTCCCCTCT GTGTCAGGTA GGCTCTGCTA CTGGCdCTG AAGTAAAGGC 132( 

^AHACAAAC GC*CAGGGCA GOG^GCAGC AATAAAAAAC TCTG< »ACAGA AACCCTTTTA 138« 

ATAAAGGAAA nCCACCCGT C^TCCTT CCA^GAAGG GTGA' jACCTT AATGTGATGT 144. 

AAGAGGAAGG T'TTTCTCTGG CTTTCAGGGA AACAGCTGCA GCTGAAACTT AGGGOCCCAT 150 

20 TCC AQ3C,:7AC rmCACCAC AGCCAGTGCA G~OjCTCCAA CTGCCACrGT CACCCCCATC 156 

ACTGCCAATT TCA3AAAGCG GTTGGTCCTT GGCITO3TCA GGACATCTTT T3TTCGATCT 162 

25 TCAGGCC3CA GAASTCCCCG AANA3CGCTG CCGCAC^CC ATATCAGGCC TCTGrTGGGC 16* 

TGATGCCAGC TCAAAGTCTT TGAAAGTAGA OGCTCCCGTC CTCTCAGCTT GCXTTIWGC 17< 

^,^ cz CGAGCAA'STT CGGATGGGGG AAACTG AAC A AAAAGGTCTC CTSTCTGCTG 18« 

30 AT^AGTGTCT CATAGGGCAA 3TCCTGAGGG ATCTGGGACA AC A'SGTGGTG GACCGAGGCC 

_ -„v-,-^ r-ar-rrrTG" TCGCGATACA ACACAATCAG GGCTGCAAAG 

A 1 rjii- mvj i 1 - -' 

35 TARATO 3GCA TCA3TSGGTG GCAC3GCCAGG .AAGAAGTCAT ATAACCGCAC GACGTGCCTG 
AAGTCA-jACA GGACATCCCC AAACCAGGTG ATGAOrCAGC TGAG3>3CAAA GATGGTCCCT 

r,^^A wTr^arr TCTQ^YTTCA CCTGGTCAAT GATGGGCATC 
A 1 :CTCAGCAC TCTGC AT3AA GTC a iGGA^ I c n*-> 

40 2118 
AGATAGTTTA ATATATGC 



1920 
1980 
2 04 0 
2100 



45 

(2) INFORMATION FOR SEQ ID NO: 193:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1538 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



60 






15 



35 



45 



240 
3 00 
3 60 



600 
660 



GACGCGGAGG CACCTCTCGT CCCGAAACCG ACCAGAGGGC AAAGTGTTGG AGACAGTTGG 
TGTOTTTGAG GTGCCAAAAC A JAAT GGAAA ATATGAGACC GGGCAGCTTT TCCTTCATAG 

5 

CATTTTTGOC TACCGAOGTG TCGTCCTGTT TCCCTGGCAG GCCAGACT GT RTGACCGGGA 
TGTG3CTTCT GOAGCTCCAG AAAAA-3CAGA GAACCCT3CT GGCCATGGCT CCAAGGAGGT 
10 GAAAGGCAAA A-rTCACACTT AGTATCAG3T GCTGATTGAT GCTCGTGACT GCCCACATAT 430 
ATCTCAGAGA TCTCAGACAG AAGCTGTGAC CTTCTTGGCT AACCATGATG ACAGTCGGGC 540 
CCTCTATGCC ATCCCAG3CT TGGACTATGT CAGCCATGAA GACATCCTCC CCTACACCTC 
CACTGATCAG GTTCCCATCC AACATGAACT CTTTGAAAGA TTTCTTCTGT ATG AC GAG AC 
AAAAGCACCr CCTTTTGTGG CTCGGGAGAC GCTAAGGGCC TGGCAAGAGA ACAATCACCC 7 2'.' 

20 ctogctggag CTCTCCGATG TTCATCGGGA AACAACTGAG AACATACGTG tcactgtcat 7 bo 

CCCCTTCTAC ATGGGCATGA GGGAAGCCCA GAATTCCC AC GTGTACT^T GGCGCTACTG 
TATCCGTTTG GAGAACCTTG ACAGTGATGT GGTACAGCTC CGGGAGCGGC ACTGGAGGAT 

25 

ATTCAGTCTC TCTGGCACCT TGGAGACAGT GCGAGGCCGA GGGGTAGTGG GCAGGGAACC 
AGTGTTATCC AAGGAGCAt^C CTGCGTTC C A GTATAGCAGC CACGTCTCGC TGCACGCTTC 102 0 

30 CAGTG3G0AC ATGTGGGG OA CGTTCCGOTT TGAAAGACCT GATG0CTCCC ACTTTGATGT 1080 
TCGGATTCCT crCTTCTCCC TGGAAAGCAA TAAAGATGAG AAGACACCAC CCTCA3GCCT 114 0 



84 0 

90 0 
960 



1200 
1260 



TCACTGGTAG GC 0A3CTGAG GCCCCAACTG CCCAGG0TTG GTCACCGGGA AGAACAACTC 
TCATCCCACA ATTGCTGCAG AACTCTTCTC TCCCCATCAT G0GCCACAGT GGGTCTCTTA 

ATTTGATTGT vOGGGTTCTTT TTCD 0GGG AG GGG TGGT AT A ACTTTTCTTC AGAAGACCCA 132 0 

40 TGTGGGACAC CTCCAAGGCT GGCCTCCTCA TAAGC C CTGC CTACACCATG TTCCAGTAAA 13 3 0 

CCTCTCCACC AAGGAACTGT G TTC AGCTGC CAC AGGCC TG GAGGAGTTTC CTGGCCTGTC 144 0 

ACGTGAGG IT TGATCAGTAA ACCAGTGCAS GYTTGGCCAA AAAAAAAAAA AAAAAAAAAA 1500 

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAACTCGA 1538 



50 

(2) INFORMATION FOR SEQ ID NO: 194: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1098 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 194: 

60 






10 



AGACCCTGTC TCAAATAATA ATAATAATAA TAATCTTATT TTGGAGAATA AAGAGACCTS 
TGGATTTGAG GTGCCATTTG '3GTAGAAAGA AAAGACGTTT ACACCGAGAA ATAGTCTGTG 
TTGCCCTGAA G SAG- AGAGG GATGCATCGC TGGAGGTGAC CTACAGTTGA AGAAGACTCA 
TTAT jACAGA CCTTGTCCTT CrTCCTTCTC GAAA3TGTTT CCTCT3CTGC TACTGCTCAT 
GACACTCTTC CCC 7TCCCTG TCCCAGOGAA CCAAAGGGCT TTNCTACCAC ACC CTTTCTT 
N0CCCCCO0C r TC CCA TC TC TGCTGTGC ZT TTGTACTCAG CAATTCTm3 TTTGCTC CCA 
TTATCTTCCA GCCGGATACA GAGTGAATAG TTAACCACAC TTAGGTCAAA TAOGATCTAA 
15 ATTTTTGTTC CTCCTCCNGT GTAAAGAGGC CAGTGTTTGT GTGTTCOAAG CAC'CCTTGGA 
ATAGTAACTC 'ITCTCATTTG TTTGGGATCr GGOCAMCAAG TTCCAGAATG ATACACGGAT 
CAGTGCAGAA GTTCATCAGG CTCMKKX: TTAGGGCTGT ^AGAAGGC TTCAGCAGCA 
20 GAACTGATGG TKAWKGYTCG TGTTCTCCAT CCTCAACTTT CirTfJCTTCG ATCATACACA 

, ttcttgttc attg:aoccg tgttttgtga 

AGAATACATT TOG AAGGG :A AAAAAT'.,AAC "CIOIIOIIl 

25 CACAGATGCA CAGTCTGCTC TGAAGAC I'TT CTCTCAAGTG GSATYTGGGA G1CCATGCCA 

GATCATGGTG CTTCATGAGA GACTGACAGC TATCAGGGGT TGTGGCACTT AC;TGAGGACT 

CTCCTCCCCC AGTGTGTGCT GATGACACAT ACACACCTGA CAATAGCTTG AGTCTTCTCT 

30 GTTCCTTTTA CTCTGTACVCC AACATACACA TGATTTAAAA CCCITTCTAA ATATCTATCA 

. _ ,-.^jtt-tt.T AGTTCATTAT TATTTCCAAG 
TGGTTCATCC TTGTCCAAAT u_rt l .».'. -«" n.r-* --- 

35 GCGAATAGTT GGCTTTCTTT TTGCAAAAAT AATTAAAGTT TTTGTATGTT C-CAAAAAAAA 

AAAAAAAAAA CTACGTAG 



40 



(2) INFORMATION FOR SEQ ID NO: 195:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1001 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195:

GAATTCGGCA CGAGATACCT TGCATCTCAT CCCAGTAAAA CC.VCTTATTT A^ACATATC 






AGGCCAGGCC TGATCCCTGA GGGATGCATG AGAAGGCTTG GAAT2TCATT CTGCTATGGT 360 

GGCTCTCTCT TGATCTT7TT GG AGT AG 2 AA AAA 7AGCAAT GTGGGCCCAA TGGTGTGGCC 4 JO 

5 

TAAATGATCA CAAA3GTAAA TG AGT AAAGG GCTCAvTCAGA TGAGTAAGGA C-CCTTGTCCT 4-3 0 

GAGAAATTAG CACTGGGCTC TG7ATTCAGA AACAT2TGAT AAGCATTGCC CATTGCACAT 54 0 

10 TO 7 C TTT A TT CTGTAAG3AC ATGAAATTCC AGTTTTGCAT AG2TAGTGAT GAATACCTGA SOC 

AG G 7 AATTGC AGACATATTT T ATTTT ATTT TTAATTGACA GA' POGAATTG TATATATTTA 660 

TCATGTACAT A A T CAT G7TT TAAAATATGT AC A ' FT ATGG A ATOGCTAAAT G AA AC T AAC C 7 2 0 

15 

TAG GGATT AT CT7ATATAAT TGTCATTTTT GTG7CGAGAA GACTAAAAAT CTACC 7TTTC 730 

AGCATTTTTA AAGAATA7AA TGTGTTTTAT TAA 7AACAGT C AG C ATTT GG T AC ACT AG AT 84 0 

20 CTCTTGAACT TGTTCCTCTT ATCTAACTGA GATCTTGTAA CCTTTGATAA CAGCTCCCAA 900 

GC7CTTCCC7 AACCACTG7T CCACCCGTGG TAACCACCAT TCTATTCTCA AGTTCCTGGT 96 0 

AA T G A C C A TT CTAGACAGAG OGAAGACTCT CTACCCTCTG A 1001 

25 



(2) INFORMATION FOR SEQ ID NO: 196:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1443 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 196:

ATAAACTGAA ATAGGTC "ATG CAAATATAAA AT ATT ATTTT TAAATTATTT GTCATAAGAA 60 

40 

ACGATGGTGG CCATATTTTG CTTTAATAAT GGAAAAAATG TGGTTAGCAT TCTKTGGAAG 12 0 

GTGGTCATCA GATAGTAGAC ATTTTCTAGG ATTTATTTCT AC C TGC AT AT GTGGAAATGT 180 

45 GTACTACTTT AGATTTATWT AATGGCAGGT AACTCAGAGG CATCAAAATG TGC T AATGGT 24 0 

GTAATATOGC CTTTGT7TTG CTGTYCTGTT TT GT ARG 2 C T TCAATCAAGC ARGOGOAG3G 300 

CCGTACAGTG AACTTGTCCT TTG3CAGACG CCAGCGTCTG CCCCTGACCC CGTCTCCACT 3 60 

50 

CTCTGTGTCC TGGAGGAGGA GC C C CTTG AT GCYTACCGTG ATT'TACCTTC TGCGT07CTT 420 

GTACTGAACT * 2*GG AAG AGO C GTGCAATAAC <2jGAT<7TGAAA TCCTTGCTTA ; 7ACCATTGAT 48 0 

55 CTAGGAGACA CTAGCATTAC CGTGGGCAAC ACCACCATGC ATGTTATGAA AG ATG TCCTT 54 0 

CCAGAAACCA CCTACCGGTG AGTOCAAGGG AGTAGAAATC TGC ATCAGCA CATC AGCAC T '600 

T-7GGGATCTA AGTAAACCTC TCGGGGAAAA TGAGCAAGTG GATGTCATCT CCCAGCTGTT 660 

60 




TCTAAGAGCC CAGATGTCCA GAGTATTGCC TCACCorCGl-. u^'-T. 
■TGAAAAAGCC ACACTOOTTC AC-GGACTCAC CGGAG GGTTT CGTG" 
CCGTCTCTAC CCCAGAGTGG AGrCAGAATCC CCAAGCCGCOC CTCT^ 
AATTATAAAA 1 GGGCTTTG* GC AATATGTTAG CCCAAGAGGCT 7GGC 
GCCGACNTTA ACAGTGGCTT AAATGATGGT AAAACGTrrTA AGAT 
TTGGAGATAC GTTGACTTTT ATTAAACMAC CTATAGCCGT CTAA 



ATCTGGACXTT CAGGGGTTCA AGT GAGGG AA C A. C AC'ITCC GA GGATCA7COT 
15 AATGCCAGGT AAC C CGTT G A AATTATCAAA AAC AC CCC GG ACGT AC GA.GA 
GAGGATAGTT CTGTTATOGA GAAGATCAAA CGGTGCAGrOA GTGTAGGAAC 
TGAGCTTAGA TTTGGATAGT AAAACCTCAA C.ACCCGAGGT AAAAAGT A CT 
AGCATAAATA ATTTAATTCA GTGTTAANAT GC G AG . GG- 'ATA GT AT ATG*G- v. z 
AAAGAAACTC ACATTGGGAG AATGCCACCT COTTCC 



20 



30 



45 



TTTT AG AC AG ATGGAAATTG AATAGCTTTA '^AAAA. GGGGAA 
AAA 

(2) INFORMATION FOR SEQ ID NO: 197:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1282 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197:

gaaaaaaaaa agtatgagcc agtaoctagg cacctotg-gc cccocgaagt 



AATTAAGTGT C AC AGT AT C A TCTTAGAAGT GAAAGAGCGCC CC/rCCCA.CCT 
TGTACGAGCA CGTAC r rGAGA AAG AAC ATGG 3 CGG - A_ - - ^ .G---_--^--"l 
TOCTATGGCT TGTATGTGTC CCCTCAAATT CGGAGTOCA7 OG CAATG' 



73 0 
1-54 0 
900 
950 
1020 
1030 



.GT CC2ACGAATGC 1260 

gcg go:gag.tgapga 1120 

'A -GATA-GCCCCT GGO-.GATGJGCA 13 3 0 



1440 
1443 



60 
120 



50 GTGGGGTCTT TAAGAGATCA CTAGGCCATO A^GaATTCTC : iAGGACGGG GAGGAAGGCC 300 
CATAATAAAA GAGCTTTCAG GGAGCATCCT CCT A O-OGOCGC CTTCTGi.-.TG _ ^---»Ov-/-. 






AGGGCATAGG ATG AAC AAG T TACTGCTAGA CCTCTCACAA 
GTATTTTCAT CATTTJCTTGT CTCTTCCGAA GCCGaACAG'TA 

CAAA A.TTCMA AAT.AATTTCA AAGOGCCTAA AGOACTA^TT 
TAATGGTACT ACCACTCTCA AATTTAAAAT GT 2ATCTTAC 
ATTTATTGCT AAAACCTGGT AAACA.CTTTA ATCCYTTTCA. 
GTCCAGAATT ACTCGAAGAC TAATAGTCAC GT GACTTCT C 
GTCTAATTCT GGTTACAAAT AAGTAACTGC GA. AA 2 T AA 7 G 
TCTCGTCACT 'jCTTTOCT G A ACAATGTAAA AGCTCCCATT 
TTTCCACTGT GTATACAATA CATCCATGAT CTGTATC CA-3 
CTTTATACAC GAGC CCGGAT GCCACATCAA ATTAAATTAT 
AAAAAAAAAA AAAAAAAG TC GA 



84 0 
9C0 
9 6 0 
1020 
1 0 R 0 
114 0 
12 0 0 
1250 
1232 






(2) INFORMATION FOR SEQ ID NO: 198:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 951 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198:
ATTTCGGAAC GA liGACTGAA GTGGG AGC GG C CG-CAGG-GT A. 1AAGAEAGA 
TGTGGTAACT AAAGAATGTT TCTGTTTTGT TAATTATTGT GTGTGTGTG 
TGCTTAAGAG AA TC AAAAAC TGAAAAAAAT GAX1AATACAG aAAATC-GCC 
TTTTGCTGTG TTTACAGCTT GTT AATGC T G TACTGTCCATT TTTTC-AAGA 
ACTGC C CAGG TGGTTTTGTG TCCTGAGCCC TACGCCCAGC CGACGTTA: 

gtttagatgt ttgattttgt tctgtttgct attgttatct taaac-gcc: 
atgccagaca tx^aaattaag ctcaaattaa gg-tctcgttt aaatgttt; 



TCTGAC 



TATATTCTAA TTGATGCCAG CCACTGATGC ATGTACTTTA C^ACGATCTG CTAAATAAGC 
ATATTAATTT TCCACATCAG ■ 3GCATCAG AT CTTGA.GAA.C G AACAGTTATC TAGAATTCCG 
TGTCTACTAA TGTTTCACCT 1 3CATGCAGC G TTCATT AA.TT TTGTAGGAAA ATACAAAGTG 
ATCATTATGT AGT7TTCTGGA TTAAAAAAAT TTGTGTGTGA AGTTOCTTTG TAJAAGTGCAT 



-0 
120 
180 
240 
300 
360 
420 
430 
540 
600 
660 






GTGGAATTAA TOGGACAGTG TGCCCTTTGT GTTAGATGTT AGAGCAAAAG AAAGGGCTTA 
T AGTGTTA GT ATTGGAGCAC TTTG AAG AT A GATATTTTCA GAAAAGATGT AGSATTTAAA 
5 AGTTAAATTT TAAATTTTAG AAAAAG AT AT G ATGGC AA' IT ^ATnoiL A.^.iGr^^ 

, ^^^r^r^, T^^r^r-p GGTAATGTOT AAACTGTGAG 
TCTTCATCCA GTAGGTGTTT AA^ AGTGTTA Hi^^- 1 

TGATTT AG AA TAAATGATTA T GAATT CAAA AAAAAAAAAA AAAAAAOTCG A 



10 



15 



(2) INFORMATION FOR SEQ ID NO: 199:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1740 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199:

TTATTATAAT AATGATGATG ATTCCAAGGA AAAAACCTAG AGCGAA^TT CCATTTCTAC 

25 CGCGCACGCA G AC A Z T CTC C CTAACACTGA TAACCTGAG,: CCCCA3CACT <5GACGGAAGA 

ATCCTCGCGT CTC CGTGTGT ACTGOITCAG GGTTCTCGCC CCAXCTTGT CAGGACCCCC 

30 TGGTGTCCAG AGCCCCCACC CCTCCCGCAA CAAGCAGCTG ATC:CCCAGT GATTCTCTAT 

ACATTTTTCA CCTCGGCCAA TATGTGCAGG AAAACT.3GTT ACTTCTCTTT T C TTGC C TGG 

AGCCTTCATT GTTCACCCTT ACGTTGCAAT AT AGG A ATT A ATGCTACAAA ATAAAAGTAA 

35 AGC :ttacctg aaaagtgc at agt^ggggo aatggtatct agatctccca ctgt^saaa 

AC CAGC AAAG C AT 1 OA AAA 1 JT CTCAATTCTC CTGTTACCRA AT^AGATGT GAATTATAAG 
ATGTTTATGT T'TG AC C ATTG TTTCAACAAT GGGATTTTGT TACGAATTAT CCCTTTAACT 
GAAAC-CTCA GTTTT ACTG T TTACATTATT AGGAAAACAG GGATAT 2TTT TGAATCTAAA 
AATTTG ATGT AC AOS ATG TG ATTTTTGAAG TTT AC ATGT A AAGTC A GAGT ATAGGTGAAA 

taacgtttgt c at attttg a ga:gtatcct gcaqciatgt ttttacgtga gtg ' rrrr AC r 

C AAA STAC AT C^TAGACAGT CTTTCACAAT AAAAGGAAAA C^ATrTTTTT TCCTCCAAAT 

gtacatttat caacctaatg ATTGArrrrr ttaaaaagag atttcgcccc agtctggtt 
. ^ r .,^ r ^r- r^- tcaagttata aatttccaac 

ATGAAAGTTC ATTGCCCTAA AC i ^ i GC 



40 



45 



50 



720 
^30 
84 0 
90 0 
951 



540 
600 

660 
7 20 
7 80 

GGTTT M0 

900 



£>(.) AAATCGACCG CTT AATT ATT ACT7XA.V.T 






CAGGAACTTT ACTTCAO iGO TGTCCAGATT GCAGTTGTCG CCCGTGTATG TGGATCTAGT 1200 

TCACAGAGTC TTT03AA3CC A CCAGTCGTG CCGTCCGTAT AC T GT C C ACT CATTTTATGT 1260 

AGATTTGGTA TCCTCA3GAJ CCAGTGTTAA CA rCACTGTC ACOTAGTTAN CAGATTCATC 132 0 

TTTTATGTAT TT AAAGT AAT C GAT ACT AT G ATCTOOTTTT TCCCTGCACC ATTAATTCTG 13 30 

GCATCAGATC AGTTTTTGTG TTGTGAAGTT CTA 2TGPGGT TTGACCCAAG ACCACAACCA 1440 
TGAGACCCTG AAGT AAA jAT AAGGTACACA TA 3ACTATTT GAG TAACTGT TTCCTTG3GG 
GCCAATCTGT GTATGCTTTT ACtAAG'ITTAC AG AAT GC TIT TATTTTTGTC TATAAGAAAC 

AGTCTGTCAT TTATTTCTGT T'CATAAACCA TTT3GA'ZAGA GTGAGGACGT TTGCC CTGTT 162 0 

ATCTCCTAGT GCTAACAATA CACTCCAGTC AT 3AGCCGCC, CTTTACAAAT AAAGCA2TTT 168 0 

20 TGATGACTCA KAAAAAAAAA AAAAAAAAI 1C YCC-L^G-CGO:^ GCCGGTAA20 CATTTMNCCC 174 0 



15 



1500 
1560 



(2) INFORMATION FOR SEQ ID NO: 200:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1707 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 200:

35 GCTTATAGAA CrGG AG A* 3G AG OGAACATGGC AO' 'OOCTT G3 CGGTT TT*GCT GTGTCTCTGT 60 

GACCATGGTG GTGCOCCTGC TCATCGTTTG CGACGTTCC3 TCAGCCTCTG CCCAAAGAAA 120 

GAAG3AGATG GTGTTATCTG AAAA3GTTAG TCA3CTGAT 3 GA-YTGGACTA ACAAAA3ACC 

TGTAATAAGA ATGAAT<3GAG ACAAGTTCCG TCGCCTTGTG AAAGCCCCAC CGAGAAATTA 

CTCCGTTATC GTCATGTTCA CTGCTCTCCA A CTGG ATAG A CAGTGTGTCG TTTGCAAGCA 

45 AGCTGATGAA GAATTCCAGA TCCTGGCAAA CTCCTGGOGA TACTCCAGTG CATTCACCAA 



40 



50 



130 
240 
300 
360 



CAGGATATTT TTTGCCATGG TGGATTTTGA TGAAGGCTCT GATGTATTTC AG ATG 2 T AAA 420 



480 
^40 

600 



C ATG AATTC A GCTCCAACTT TCATCAACTT T 2 C TGC AAAA GGGAAACCCA AACGGGGTGA 
TACATATGAG TT A CAGGTGC GGGGTTTTTC AGCTGAGCAG ATTGCCQ3GT GGATG'GCCGA 
CAGAACTGAT GTCAATATTA G AGTG ATT AG ACCCCCAAAT TATG2TC-GTC CCCTTATGTT 
55 GGGATTGCTT TT«3GCTGTTA TTGGTGGACT TGTGTATCTT CGAAGAGTAA TATGGAATTT 660 
CTCTTTAATA AAACTGGATG GGCTTTTGCA GCTTTGTGTT TTGTGCTTGC TATGACATCT " ' 7?Cj 
GGTCAAATGT GGAACCATAT AAGAGGACCA CCATATGCCC ATAAGAATCC CCACACGGGA 

60 



780 






-vv-cVTT-TG TACCTGAAAC AGACATTOTT 

CATGTGAATT ATATCCATGG AA^oTCm,, 



_„.. r „ .,«r«TAATC T--TCTG-T7, GTATTGGACT TGTTOTATTA 
GACATGGATA TTGGAAAGt-^ ; .^,-.G^.l A-.iL, 

TTCTTCAGTT aBMOCTCK TATTTTTAGA TCTAAATATC ATGGCTACCC ATACAGCTTT 
CTGATGAGTT AAAAAGGTCC C AG AG AT AT A TAGACACTGG AGT ACTGG AA ATTGAAAAAC 



15 



25 



^CCTOTCT CAAAATCTGA ggtatttgaa aataattatc: ctgttaact tctcttccca 
GTC AAcrrrA to,aacattt aatt-tagta, aattaagtat attataaaaa ttgtaaaact 
,_-,^r., ^r,CTACTTT agtt.aacttg gtcatctgat 

ACTACTTTGT TITAM'i^ 1 ^ rt^..ru-*r^-- — - - 

r-rWTOCTG A r 'CAGGTGTT CCCACATATG 
TTTATATTSC CTTAT CC AAA GATGGO..AAA G r.W.TC, TO * — 

n aT ^- A - AT TAGC-AATTCA TTCTTAGCTT CTTCATCTTT GTGTGGATGT 
CCTGTT AC AG AT AAG T A^ AT I AW - ^ 

~ . -A-Aar V T-;-r CTGTGTCATG TGGTCTTCTG 
GTATACTTTA CGGATCTTTC CTTTTGAGTA .^AaAM G FoTGT . 

^■^zu— TPTATn-v ^GSAAGACAG TTGTTTCTCC 
AAAATSGAAC ACCAlTCTTC AGAGCA^ TCTA'j* 

^ -rv-^TACA GTGCTGTCTA TGATTGTTTT TGTTTT 3TTG 
^ i _ Tr , CT rr, GC ATAT'TT^ CTA CTCWVmI^C^ aj-l^ 



35 



(2) INFORMATION FOR SEQ ID NO: 201:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 779 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



840 



jTTTA ATGGTGGAOT TACCTTAU.A mT..-0/TG-^ i-- 



45 



50 



960 
1020 
1080 



10 v _ . _ rAATT^AT ATTTTGTATT acctcttttt 1 l 40 

GAAAATCGTG TGTGTTTGAA A^g^A^^ - 

„ „., Mr o r . n-c TGT AGT' yr 1 2TT AAC AAGC 
TT'SAAGTGAT TTAAATAG1T AATCATTTAA o^AAG^G,, T^TAGI jC 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 201:

m „ ^^^-v-. : -yv. r7v - ;A CAAAGTTTOA TECI'GTT^G 

CTGTCCCCAG TGTTTC^A^ - .-^G ^ — - 

.^ Arrrrr r A'^CTGGO AT GCAGGTCAGC CCTTC^CA^C GGGCGTGGCG 
TGTGGTGGCT CCAAGCCAm^ CAai^rt. 

TCGTCCTCTT CACACAXGCC ACGTTGCAGC CCCAA^CT CACCA^ CGTTTTTTAG 
AAACCC ATTT TCTTOGTCAT TTATAAAGCT GCTTTA.kG^ i^-TTIX, 



120 0 
1260 



1560 
1620 
1680 



30 1707 
TTTTTTYGAG ATCACGYTAC TGCX--CTC 



60 
120 
180 

2 o 



GO 






TAGGTGTC A T AAATTTTAAG AAACCTGCTT TTAAGTACTA TTTATAGCTT TTTCTGTTAT 54 0 

ACTTGCAACC T AG TTTT AAA ATA JAT'CAGG ATTTTATGAA AGCTTT AT AC AGACATTTAT 600 

5 

aggaaact:a ttctttgatt tta^gtcoca tttj^-attga taacacttac tttataaaaa 6G0 

GATGCTTTTT GTCK3GATAG AGCCTTATAG TTTAAAATAT CTTCATATAT TGC C ATTTG A 72 0 

10 TCAAATAAAT TTCTT ACTT A GAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAACTCGA 77 9 



(2) INFORMATION FOR SEQ ID NO: 202:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1617 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 202:

25 GGCACAGCTT TCTGTCTCTT CCTCGCTCCC TCTCTTTCTC TCCTCCCTCT GCCTTCCCAG 60 

TGCATAAAGT CTCTGTCGCT CC:CX;aACTT GTT : G 3C AJ\T' 3 CCTATTTTTT GOCTTTCCCC 120 

CGCGTTCTCT AAACTAACTA TTTAAAGGTC TGOSGTOiSCA AATGGTTTCIA CTAAACGTAG 180 

GATGGGACTT AAGTTGAACG GCA3ATATAT TTC ACTGATC CTCG03GTOC AAATAGCGTA 240 

tctggtgoag gccgtgagag cagcoggcaa gtgcgatocg gtcttcaagg c^ttttcgga 



30 



40 



50 



60 



ACATCAA3AC CGTGTGCACA TACT-GOGA'GG ATTTCCACAG CTTGCACGGTC A.AGCCCTTA 
CGGATTGCCA GGAAGGGGCG AAAGATATGT GGCATAAAC:T GAGAAAAGAA TCCAAAAACC 



CTATATTAAT CATGCTAGTA ACATGAAAAA TGATGOGCTC CTCCTAATAG <3AAGGCGAGG 
AGA'CGAGAAG GCCAGGGGAA TGAATTCAAG AGAGATGTCC ACGGACGAAA 'CATACGGTGA 



300 



35 CTGTTTGC TC AAGCTGGGCG ACACATGGCC AACTACCCGC AGO- TGXAC GAC AAG AC G A 360 



420 
480 



TCAACATCCA AGGCAGCTTA TTCGAACTCT G03GCAGCGG CAACGGGGCG GCGGGGTCCC 54 0 



600 
660 



TGCTCCCGGC GTTCCCGGTG CTCCTGGTGT CTCTCTCGGC AGCTTT AGCG AC CTGGCTTT 
45 CCTTCTGAGC GTGGGGCCAG CTCCCCCCGC GC3CCCACCC ACACTCACTC CATGCTCCCG 

GAAATCGAGA GGAAGATCCA TTAGTTCTTT GGGGACGTTG TGATTCTCTG TGATGCTGAA 720 
AACACTCATA TAGGATTGTG GGAAATCCTG ATT' ;TCTTTT TTATTTCGTT TGATTTCTTG 7 80 

TGTTTTATTT GCCAAATGTT AC CAATCAGT GAGGAAGCAA GCACAGCCAA AATCGGACCT 
CAGCTTTAGT CCGTCTTCAC ACACAAATAA GAAAACGGCA AACCCACCCC ATTTTTTAAT 



840 
900 



5 2) TTTATTATTA TTAATTTTTT TTGTTGGCAA AAGAATCTCA GGAACGGCCC TGGGCACCTA 960 



1020 
1080 






ATAATTCACG CTCACGTCGT TCTTCCACAG TATCTTGTTT TGAT3ATTTC CACTGCACAT 1140

( 7? AAAA" X3 AA AGGACAGACT GTTGGOTTTG 'p; ^TTTOGAGC ATAGGAGGGA 1200

C 1 AG A GGV A A 3 GO 3CT 3 AGGA AATCTCTGG3 3TAAGAGTAA AGGCTTCCAG AAG AC ATGCT 1260

GCTATGGTCA CT3AG3G3TT AGCTTTATCT ■3CTGTTGTTG ATGCATCCGT CC.AAGTTCAC 1320

TGCCTTTATT TTCCCTCCTC CCTCTTGTTT TAGCTGTTAC AC ACAC AG TA ATACCTGAAT 1380

AT3CAACGGT AT A 7j ATC AC A AGG2GGGGAT GT T AA ATGTT AATCTAAAAT AT AGCT AAAA 1440

AAAGATTTTG ACATAAAAGA GCCTTGATTT taaaaaaaaa AG AC ^ AG AG AG atgtaattta 1500

AAAAGTTTAT TATAAATTAA ATTCAG3AAA AAAAOATTTG CTACAAACTA tagagaasta 1560

TAA.AATAAAA GAAAAAAAAA AAAAAAAAAW CTCGACCG-A AQ3GAAT 1617



(2) INFORMATION FOR SEQ ID NO: 203:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1974 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



35 



i\g:;tgcag cgcaoxagag tatctga:gg cgccao3ttg 6 0 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 203:

gaattcg gv- c gaggctoac 

CGTAG3T3CG GCACGAC^AG TTTTCCCGGC AGCO AGG AGG TCCTGAG2AG CATGGCCC<3G 

aggaq:g:ct tccctgccgc cccgctctgg ctctggagca tcctcctgto cotgctggca 

CTG2G3G3GG ACVGCCGGGCC G0CGCAG3A3 GAGAGCCTGT ACCTATG3AT C 3AT"GCTCAC 
40 CAG3CAAGAG TACTCATAGG ATTTGAAGAA GAT ATC CTG A TTGTTTCAGA G-3GGAAAATG 
GC ACC TTTT A CACATGATTT CAGAAAAGCG CAA 1 GAG AG AA ^3CCAGCTAT TCCTCTCAAT 
ATCCA-G3CA 3 ' ' G\ A r 3T TT AC C^AAGOT <KY,.^G3AGG C AG .A/AT AC IT CTATOAATTC 

45 ctgtccttgc o:t-ccigga taaaggcatc ato xmatc CAACCGTCAA TGTCCCTCTG 480 

C TGG3 A AC AG TGCCTCACAA G>3ATCAGTT 3TT- 0AAGTT G GTTTCCCATG TCTPOGAAAA 
50 C AGG AT 1 X3GG TGXAGOATT T3AAGTG3AT GTGATTGTTA TGAA.TTCTGA AGGCAACACC 

ATTCTCGAAA CACCTCAAAA 1GCTATCTTC TTTAAAACAT CTCAACAAGC TGAGTGCCCA o60 



120 
180 
240 
300 
360 
42 0 



540 
600 



60 



TAATGGA 






TSCCCTCCAG OACTAGAGGG AGAGOAGTGT GAAATCAGlA AATGeCCACA ACCCT3TCGA 9>o0 

AATGGAGGTA AATGCATTGG TAAAAGCAAA TG T AAG TK T T CCAAAGGTTA CCAGGGAGAC 102 0 

5 

CTCTGTTCAA AGCCTGTCTG CGAGCCTGGC TGTGGTGCAG ATGGAACCTG CCAT3AACCC 10RC 

AACAAATGCC AATGTCAAGA A3GTTGGCAT GGAAGACACT 'IXTAATAAAAG GTACGAAGCC IMC 

10 AGCCTCATAC ATGCCCTGAG GCCAGCAGGC GCCCAGCTCA GGCAGCACAC GCCTTCACTT 1200 

AAAAAGGCCG AGGAGCGGCG G3ATCCACCT GAATCCAATT ACATCTGGTG AACTCCGACA 1260 

TCTGAAACGT TTTAAGTTAC ACC AAGTTC A TAGCOTTTGT TAACCTTTCA TGTGTTGAAT 132 0 

15 

GTTCAAATAA TGTTCATTAC ACTTAAGAAT ACTGGCCTGA ATTTTATTAG CTTCATTATA 13 80 

AAT 1 2 AC IX"} AG CTGATATTTA CTTTTCCTTT T AAGTTTT C T AAGTACGTCT GTAGCATGAT 144 0 

20 GG T AT AG ATT TTCTTGTTTC AGTGCTTTGG GAGAGATTTT AT AIT AT 3TC AATT3ATCAG 150 0 

G TT AAAATTT TCAGTGTGTA GTTGGCAGAT ATTTTCAAAA TTACAATG2A TTT ATG 3TGT 1560 

CT" X5GGGC AG GGGAACATCA GAAAGGTTAA ATTG2G 3 AAA AATGCGTAAG TCACAAGAAT 162 0 

25 

TT }G ATGG T G CAGTTAATGT TGAAGTTACA GC ATTTCAG A TTTTATT3TC A 3 AT ATTT AG 168 0 

ATGTTTGTTA CATTTTTAAA AATTGCTCTT AATTTTTAAA CTCTCAATAC AATATATTTT 174 0 

30 GACCTTACCA TTATTCCAGA GATTCAGTAT TAAAAAAAAA AAAATTACAC TGTGGTAGTG 1300 

GG ATTT AAAC AATATAATAT ATTCTAAACA CAATGAAATA GGGAATATAA TG T ATG AACT I86 0 

TTTTGCATTG GCTTGAAGCA A1ATAATATA TTGT AAAC AA AACACAGCTC TTACCTAATA 192 0 

AACATTTTAT ACTGTTTGTA TGTATAAAAT AAAGGTGCTG CTT TAG TTTT CTGA 197 4 



35 



40 



50 



(2) INFORMATION FOR SEQ ID NO: 204:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1057 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 204:

CGGCCTTCCG GGGCAACCGT TCGTCCCAAC HCGGGAAAGG GTCCTGGAGN CGGGAACTAG 6 0 

GA3CCTCGGA AGTCCAA3GG C3GAGCGCCC TTTGCTAATA A GC CAAT ^ AG AAC ! 3TG AG AC 12 0 

55 GCTCCGGTGG GNCGGTGCCG TCGAGCGCGG GGTGGAGTCT GGGTGACTTG GCT3GCGGGA 130 

TG AAGTGC AG CTGCTTCAGG CTGAGGT3GC AGATAGTGAG CGCTGGTGGC GGAGTTAAAG 24 0' 

TYAAAGCAGG AGAGTAATWA TGAATAGCGC AGC GGG ATTC TCACACCTAG ACCGTCGCGA 3 00 

60 






10 



GCGGGTTCTC AAGTTAGGGG AGAGTTTCGA GAAGCAGCC3 COCTGCGC. i CCACACTGTG 
CGCTATGACT TC . W ,; CT GC TTCTATTGAC ACTTGTTCTG AAGGATACCT TGAGKTTGGC 
GAAGKTGAAC AGKTGAC CAT WAHTCTGCCM AATATAGAAA GTTGAA3GAA GCAGTAAAAT 
TCAGTATCGT AAAGAACAAC AGCAACAACA ATGTGGAATT CASCCAGGAC TCCCAATCTT 
GTAAAACATT CTCCATCTGA AGATAAGATG TCCCCAGCAT CTCCAATAGA TGATATCGAA 
AGAGAA n'G-A AGGCAGAAGC TAGTCTAAT3 GACCAGATGA GTAGTTGTGA TAGTTCATCA 
GATTCCAA/A GTTGATi'IATC TTCAAGTAGT GAGGATAGTT CTAGTCACTC AGAAGATGAA 

15 gattgcaaat cctctagttc tgatacaggg haattgtgtc tcaggacatc ctaccatgac 
acagtacacw att::ctgata tagatcccag tcataataga tttcgagaca acagtggcct 

TCTGATGAAT ACTTTAAGAA ATGATTTGCA -XTGAGTGAA TCAGGAAGTG ACAGTGATGA 
CTGAAGAAAT AITTAG-rTAT AAATAAAAAT TTATACAGCA TGTATAATTT A'lTTTGTATT 
AACAATAAAA AITCTAAGA CIGAGGGAAA TATGTCTTAA CTTTTGATGA TAAAAGAAAT 

TAAATTTGAT TCAGAAAAAA AAAAAAAAAA AACTC3A 



20 



3 60 

42 0 

430 
540 
600 
660 
72 0 
7c 0 
840 
900 

1020 
1057 



(2) INFORMATION FOR SEQ ID NO: 205:

(i) SEQUENCE CHARACTERISTICS:

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



45 



1C, - 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 205:

40 G AAT1X20 jC A CGAGTCATCC CTCTCCCTCT TTCACTCCCT T AC TCTT AC T CTGTTTTTTG 

„ , .„ mi ^,-n^^ rrTT ^T7TTT TGTTTGTTTG TTTTGAGATG 

TGCTCCA_>Al. jv.v^ jA— i A'_ ^ I i i - 

^—^--p , : ^ :yrr ^ cc AOSCTQOAOT GCA3TGGCGC AAT-T CGGCT CACCACAACC 



GGCATOIGCC ACCACACCCA 1 SCTllAATTTT ATATTTTTAG TAG AC jATGGT GTTTCTCCAT 
50 GTTGGTCAGG CTGGOCTCAA ACTC 2CAACC ^ACGTGATN CC GCCTGCTT TGGCCTCCCC 

aaagtxx;tgg gmtacaggc ^agccact gcccccagcc t-tt^gctc ctttatactc 



60 
120 

180 



rcTGGCTCCc i^arrreAASC ;^\ttctccto cctcagcctc ccgagaagct ggggaxTAca 24u 



300 
360 






TACACTCAGC CTGGGCAATA GAGGGACATG TTGTCTCTAA AAAAAAAAAA AAAAAACTCC 
A 



50 



(2) INFORMATION FOR SEQ ID NO: 206:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2465 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 206:
CCACCATTTA TCCAACTGAA GAGGAGTTAC AGGCAGTTCA GAAAATTGTT TCTATTACTG 
AACGCGCTTT AAAACTCGTT TCAGACAGTT TGTCTGAACA TGAGAAGAAC AAGAACAAAG 



AGCX3AGATGA T AAGAAA-j AG ijun^j ^ n> 



720 
721 



60 
120 



ACAGAGCTTT GAAAGGAGTT TTGCGACT'SG 180 



240 

3 00 



GAGTATTG02 AAAAC 5G AT? A CTTCTCCGAG GAGATAGAAA TGTCAACCTT GTTTTGCTGT 
GCTGAGAGAA AC CTTCAAAG ACATTATTAA GCCGTATT' 3C AG AAAAC CT A C C C AAACAGC 

T TO 2 f T 3 TT AT AA3CC3T3AG AAGTATGACA TAAAATGTGC TGTATCTGAA GCGGCAATAA 3 60 

TTTTGAATTC AT3T3TG3AA CCCAAAATGC AAGTC AC T AT CACACTGACA TCTCCAATTA 420 

TTCGAGAAGA GAACATGAGG GAAGGAGATG TAACCTCGGG TATGGTGAAA GACCCACCGG 480 

35 ACGTCTTGGA C AGGC AAAAA TGCCTTGACG CTCTGGCTGC TCTACGCCAC GCTAAGTGGT 54 0 

TCCAGGCTAG AjGCT AATGGT CTGCAGTCCT GTGTGATTAT CATACGCATT CTTCGAGACC 600 

TCTGTCAGCG AGTTCCAACT TGGTCTGATT TTCCAAGCTG GGCTATGGAG TTACTAGTAG 660 

AGAAAGCAAT C AGCAGTGCT TCTAGCCCTC AG AGCC CTGG GGATGCACTG AGAAGAGTTT 720 
TTGAATGCAT TTCTTCAGGG ATTATTCTTA AAGGTAGTCC TGGACTTCTG GATCCTTGTG 780 
45 AAAAGG ATC C CTTTGATACC TTGGCAACAA TGACTGACCA GCAGCGTGAA GACATCACAT 
C C AGTGCAC A GTTTGCATTG AGACTCCTTG CATTCCGCCA GATACACAAA GTTCTAGGCA 
TGGATCCATT ACCGCAAATG AGC GAACGTT TTAACATCCA CAACA-ACAGG AAACGAAGAA 960 

GAGATAGTGA TGGAGTTGAT GGATTTGAAG CTGAGGO 3AA AAAAGACAAA AAAGATTATG 1020 

ATAACTTTTA AAAAGTGTCT GT AAATCTT C AGTGTTAAAA AAACAGATOC CCATTTGTTG 1080 

55 GCTGTTTTTC ATT ' CAT AAT A ATGTC T ACAT TGAAAAATTT ATCAAGAATT TAAAGGATTT 114 0 

CATGGAAGAA CCAAGTTTTT CTATGATATT AAAAAATGTA CAGTGTTAGG TATTATTTGA "1200 

ATGGAAAGAC AC CC AAAAAA AAAAATGTGC TCCGACTAGG GGG AAAAC AG TAGTTCCGAT 1260 

60 



840 
900 






TTTTTCCCAT TATTTTTATT TTATTTTCTS GTTCCCCTAG CTTCCCCCCC TATTTTTGTG 132» 
TCTTTTATTA ACTAGT SCAT TGTCTTATTA AATCTTCACT GTATTTAATG CAG3ATGTGT 138:, 
GCTTCAGTTG CTCTGT T-TAT TTTGATATTT TAATTTAGAG GTTTTGTTTG •rTTTTTGACA 144 0 

CTAGTTGTAA GITACTITCT TATAGATGGT ATCCTTTACC CCTTCTTAAT ATTTTACAGC 
AGTACGTTTT TTTGTAACGT GAGACTGCAG AGTTTGTTTT TCTATATGTG AAGGATTACA 
ACAC AAAAAG TTATGCIGCC ATTCGAGTG- TCA'GAACTGA ATGTTTCTGC AGATCTTGTG 
GCATTTGTCT CTAGTGTC-AT ATATAAAGGT GTAATTAAGA CAC5AGTTCTC5 TTAATCTAAT 

15 cAACvrrrocr gt-tagttctg c;,ttag-::agt ataaaagcta atatatacta tatggtcttg 

CAACAGTTTT AAACCCTCTG CATAATTGAT AATAAAAATG CAT'jACATTC TTGTTTTTAA 
T AG ACTTTT A AAATCATAAT TrrAGGTTTA ACACGTAGAT CTTTGTACAG TTGACTTTTT 
GACATAGCAA GGCCAAAAAT AACTTTGTGA ATATTTTTTT CTTGTGTATA AGTGGAAAGG 

GCATrrrrcA catataagtg gxttaaccaa tattttcaaa agaacttcat catigtacaa 

25 CTAACAACAG TAACTAGZCC TFAATTATGG TCACAGTTCC TTATTGGTGT GTGTGAGATT 
ACTCTAGCAA CTATTACAGT ATAACACAGA TCATCTTCTC CACACACCCC ATCACCCAGA 
T AA.TTT AC AG TTCTGTTAAC AGTGAGGTTC ATAAAGTATT ACTCATAAAA AATTATCTAA 

30 GGAAAAAAAC AGAAAATTAT TTGGTGT^ CATCTTACCT GCTTATGTCT CCTACACAAA 2220 

_ ^p^a^tv-p taP't^ttgaT ATATGTATGC 22 SO 

GCTAAATATT CTAGCAGTGA TUTMi 1 ^ ±^<-.+ — 

35 TCTGGTACAC AGATGTCATT TTCTTGTCAC AGO ACT AC AG TGAAATACAC AAAAAATGAA 2:340 
ATTCATATAA TGAC TT AAAT GTATTATATG TT AG AATTG A CAACATAAAC TACTTTTGCT ',400 
TT 1 ^AAATGAT GTATGC'FTCA ■ jT AAAATC AT ATTCAAATTT AAAAAAAAAA AAAAAAAAAA 2460 

40 2465 
CTCGA 



20 



1S00 

15*10 

1620 

H?-0 

1740 

1F00 

18^0 

1 TOO 

10H0 

2040 

2100 

2160 



45 

(2) INFORMATION FOR SEQ ID NO: 207:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1480 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear






GCTGGCTGAA C GAGAGCA» 'X? AAGAAGCCAT TOC TC AGTTC CCATATGTCG AATTCACCGG 



AAATGAGTTG gtggcttiga TCCCAGACAG TGATCAGACA ttgcgccctc AGCGAA2TAA 



10 CCTGTTTCCG CATT2AGTCC TTGTGGATOA TCACGGCATC AAAGTGGTGA AAGTCACATT 



(2) INFORMATION FOR SEQ ID NO: 208:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 372 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 208:

CAGTATTTCC CTCAGTACTG TAAGCAAAAG TGGTATGTTT TTCTTTCTTT ATGTCTACTC 

60 



?. 4 0 



GAGAGATAGC ATCACCTCTO TCACGTGCCA GOGGACAGGC TACATTCCAA CAGAGCAAGT 3 00 



360 



GCAATATGTC CTCCTGTCCA TCCTGCTTTG TCTCCTGGCA TCT'JGTTTGG TGGTTTTCTT 4 20 



430 



840 
900 



TAATAAGCAA GACTCCCTTG TAATTCTCAC CATCATGGCC AC C CTGAAAA T I AGG A AC TC 540 

CAACTTCTAC ACGGTGGCA3 TGACCAGCCT GTCCAGCCAG ATTOAGTACA 1GAACACAGT 600 

GGTGAA1TTT AC03GGAAGG CCGAGATGCG AGGACCGTTT TCCTATGTGT ACTTCTTCTG 660 

CACCGTACCT GAGATCCTGG T GC AC AA 2 AT AGTGATCTTC ATGCGAACTT CAGT3AAGAT 720 

20 TTCATACATT GGC7TCATGA CCCAGAGCTC CTTGGAGACA CAT : ACT AT 3 TGGATTGTGG 7 80 

AGG AAA' IT CC ACACXTTATTT AACAACTGCT ATTGGTTCTT CCACACAGCG CCTGTAGAAG 

aga<:jcaca'3<- atatgttccc aaggcctgag TTCTGGACCT ACCCCCACGT GGTGTAAGCA 
25 

gaggaggaat tggttcactt aact ^ccagc aaacatcctc CTGCCACTTA GGA-33AAACA 960 

cctccctatg gtaccattta tgtttctcag aaccagcaga atcagtgcct agcgtgtgcc 1020 

30 C AGCLAAAT A G TTGGCACTCA ATAAAGATTT GCAGAATTTA ATACAGATCT TTT ZAGCTGT 1080 

TCTTACJGGCA TT AT AAATG j AAATCATAAC GTOCTTCTAG GTTATCAAAC CAT 1 3GAGTGA 1140 

TGTGGAGCTA GGATTGTGAG TiACCTGCAG GCCATTATCA GTCCCTCATC TGTGC AGAAG 1200 

35 

tcgcagcaga gagggaciat c:aaatacct AAGAGAAAAC AGACCTAGTC AGGATATGAA 12 60 

TTTGTTTCAG CTGTTCC 2AA AGGCCTGGGA GCTTTTTGAA AAGAAAGAAA AAAGTGTGTT 1320 

40 GGCTTTTTTT TTTTTTAGAA AGTTAGAATT GTTTTTACCA AGAGTCTATG TGGGGCTTGA 1380 

TTCACCCTTC ATOI^TTGGC TGGAACATGG ATTG3GGATT TGATAGAAAA ATAAACCCTG 1440 

ctttto;attc AAAAAAAAAA AAAAAAWAAA AAAAACTCGA 1480 

45 



60 






TGTCCTCTGT GGCGTTCTGG TGTACCCCTC TCTTCCT A' >C CATTCAGTCT CTCTAGTCAC 
CTCCCTAGTA GCTAGTGCTC TCT AA GTTTT TATTTAATTA GAA.C AACT - C ATTTCCATTT 
CAAGCTAGGT CAAP3C-GGGG AAAA.GCCTCA TGATTTAAA.C TGAACTTAAC AACACAGCTT 
TTAAAATGAA AACTCATACT CCAACTTCTA AAGTATATTT GAGCTCATTT CTTTCCAAAA 
CAAAGATATG CTGTACCTAA AAC-GCTAAA AC AAAAAT AT AAA 1 jAC AAGG ACTAG3TGAT 
TAAGGGC-AGA GAAAAATCAT YTCTTTTCCA GGAAACCTTT CCTAAAATAA GCAAAACTTG 
ANTCTATGCT TCATGGAAAC TGACACAAAG AAAAGAAACT GATGGATTGC ACAG-3CCTTG 
15 TTATAGAAAT AGATCTAT.AA AAAGATCTGT CCACAGGAAA TATACACCTT CTGCTGGTTC 
TGAACTTCAA TGGGGATTTG TCAOTTAGGT CTCCATCTAT AGGAATACCT TCACATACCT ■ 
ATCTATTCAT GGAGATATTC TGAAAACAGG TACATACAAA ATTACAACAA AGGAAAAAAA 660 
20 TTCTATTGAA CACTTAAAVA TAGAAACAGG CCAGGCACGG TGGCTCATGC TGTAATCCCA 
ACAATTTGOG AGGCTCA03C TOTCGATCA CCTGAGGTCA GCACTGTGAG ACCAGCTTGG 
25 CCAACATOGT GAAACCCC3T CACTACTAAA AATACAAAAA AAATTAGCCT GTGTGGTGGC 
ACACTCHTAC AATCCNGGCT GACTCGGGAA ATI 



30 



(2) INFORMATION FOR SEQ ID NO: 209:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1779 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 209:

AATTOCCAAG ACTGCACAAA ATTACAGTGC TAATCTATAT G3TTGCAGTT CACATAAAGA 

TGTTATGA-AA TGAGTAGTAA 7ATTCC,GTGG TTGATTTCTT CTTAGCAGAC 



45 



50 



120 

180 

240 

300 

360 

420 

480 

540 

600 



7z.O 
780 
840 
872 



60 
120 



TTO^TT^AT ^TTCCITCTTG AGATAAAATG CCCAGOATAA AT'DCTOTTTA TATTCACGTT 130 
TTCCTAGGTG TGTGTGTGCA GGCC ACAGC A GCATGCCCTT GGTGTAGTCA GTGCCGAAAS 
G^TGTGnC crrCTT^Gr CTCCCTCCAG OGATGGTCTC (TTTTTAAAGC aggttgtctc 

CAOCATTCAG TACACT^AAG GTAAGCTAAA CCATCAACAT CTC TGGTGTT TTAAGATGTP 3 60 



240 
300 



60 






TTTGTTAAAT ATAGTTCCTA GTGACATAGA AACGATGCGT AGTTTTCATT TACTAATTAC 66 0 

AAA TGTTG AG GCC TAATT C T GAAAGTCCTC AT ATTT AAAG GCTAGACAAC GTAATGAAAT 72 0 

5 

TTTT AACT AT TTGTATGTCA TTTTGAAAGT GTACTGCTTT ATGGTAAAAG TGTTTTTCAT 73 0 

TTGTTCATTG TTTTCATTAT TTGT3ATCAT GTTGTCTTTC AATACAGGCA TAAAC CTT CC 84 0 

10 ACTC TTGAAC AAAGCAGCTG CTTTTTAAAA GCGGTAATTG CTTCTTTACC TTTTATTTCT 900 

TTTGTAAATG AAGCTTTTCT TTAAGAATGT GACTTTAAAG TGTTGTCTAT TGCATAAAAC 960 

AGTTGACACT CACTTATTGT AAAGTGAAGA TTGTTCTACT GCATGTGAAG TO j AC C ATGC 102 0 

15 

AGATTTCTGT ATGTTCTCAG TAT'^CATCAC TAGATAATAA AGTCTTTTGT GAACAAGGCA 1080 

TTTGTAGCCA TTTT* F AAAAG TTTTTGTCTT CAGTGCTGGT AAGTC AGG T A AACCATAAAT 114 0 

20 AGTT AAAAO 2 AACCTTTTGT TTTTTTCCTG AAAGTTTTTA ATTGAAAGTA TTATTAGTTA 1200 

AAG AT 3T AAA CCTAGCCAAA ATTACCAGTT TATTAATAAT TAGGATCCTA ATTATTTCAA 1260 

AAAATCCTAC AAATATTGTC AG2TTTCAGT GTAGTGAGAT TATTCCTGTA GGTTATGGGG 132 0 

25 

TATAATTCAG G ATTT AACT A AT 2TTTCTGC TATTTTCTCA CTTTTCCTTT TGATGGTGCG 1380 

GAAAGAGAAA AAGGAAAACG GGGCACAGGC CATTCGACGC CTTCTC CA AG GGGTCTGATT 144 0 

30 TGCTGAGACA CCAGCTTCAC CTTGTTAACA AGGCACCTAA TTACAACAAG CATGC AC ATT 1500 

TT13GTGCATT CAAGAATGGA AAATCAGAAT AGCAGCATTG ATTCTTCTGG TGCAGCTCAG 1560 

TO 3 AA.G ATG A TGACAACCAG AAGACATGA3 CT AAGGGT AA GGGACTGTTC TGAAGAACCT 162 0 

35 

TTCCATTTAG TGATCAAGAT ATGGAAGCTG ATTTCTGAAA ATGCTCAGTG TGTACTCTAA 168 0 

TTATTTATGG TAG CATTTG A ATTGTAACTT GC ATTTT AGC AGTGCATGTT T CT AATTG AC 174 0 

40 TTACTGGGAA ACTGAATAAA ATATGCCTCT TATTATCAA 177 9 



(2) INFORMATION FOR SEQ ID NO: 210:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2110 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 210:



55 GCGGCCGCTG CAGCCCGGAG CTGAGCTAGC CGTCCGAGCC GAGCCGTCCG A ; 3CCGGGGAA 60 

GCCGGCGCGT GCTGCCGCTC GTGGCGGCCA GAGGAGAGGA GAGGCAGCAG CATGGCGAGT ' 120- 

GTCCTGTCCC GACGCCTTGG AAAGCGGTCC CTC CTGGGAG CCCGGGTGTT GGGACCCAGT 180 

60 






GCCTOGGAOG GGCCTCGG2T 3CCCCA2CCT CGGA3CCACT G 3T AG AAGG 3 GCCGCTCCCC 
AG3CTTTCAC CACCT-IGAT GACACCCC3T GCCAGGA3CA GC * '2 C AAGG AA GTCCTTAAGG 
CTCCCAGrAC CTCG3G22TT CAGCAG31G3 CCTTTMAGCC TO3GCAGAAG GTTTATGTGT 

g3taggggg3 tcaagagtg: acacgagtg3 tggwgcagja cagctggatg gag3GT'3ag3 

TGACGGTGTG G2TG3TG3A3 CACAACCTG3 AGGTCTGCTG C AGGGTGG AG GAG3TGTCG2 
TG3CAGAG3T GCAGGGCCCC TGTCCCCA3G CA3CACCC3T GGAGCC03GA GCCGAGCCCC 

tgggctagag gcc3Gtctg: acgaacatcg atgtcccaaa gaggaagtcg -3ACgcatgga 
15 aatg3AT3AG aigatg3CGg gjatggtgct gaigtcgctg tc3tgcaggg ctgttgtaca 

GAGTCCTCCC GGG ACC C A' 3*3 CCAACTTCTG TG23TCCCGT G^XTXG AC CCA r IX3GAA 
GGA G AGTG3T CACATCTCGG ACAC^GGCAN CA3CACTACC ACCGGTCACT GGAGTO3GAG 
CAGTCGTGTC TCCAGGCCCT CGJCCCO ^CA CCCCCAC^GCC AGC C G C AA 1 3T AITTOGGGGA 
TGCTTT2G7T TCTCCCCAAA CT3ATCATGG CI V TT3AGACC GATCCTGACC CITTCCTC^CT 
OliACGAACCA GCTCCACGAA AAAGAAAGAA CTCTCT-SAAG GTGATGTA3A AGTGC2T3TG 
G3CAAA3TGT GGCAAA7TTC TG3G3TCGAT TGTG302ATC AAACGACAGG TCAAAGCCCT 
CCATCT' _V3*3G GA 3ACAGT<3G ACTCTGATCA GTTCAA : GCG3 GAGGA'GGATT T> 2 TAG T AC AC 
AGAG3TGCAG CTGAAG7AGG AATC^3CT«3C TGCTGCT3CT GCTGCTGCCG CAGACCCCCA 
GTCCGT-G:;GA GTGCCAGCTG 0CAGCCA-3CT CGCACCCCCA G3ATGACTGG CCTCCCTCTG 
35 TCTOCTCTTC CACCACCTCT CX2ACAAAGCC CAGTCCTCCG GCCCAGAACA TGCTGGCCCG 
GAGTCCTCCC TGCCCT 2A 3G «3GCTCTCAGC AAGTCAGCTC CTGGGTCCTT CT<3GCACATT 
C A 3G 3AGATC ATGCATAGGA GGCTCTOCA TCCTTCCAGA T2CCAGTCTC AC C AC ACATC 
TACACCAGTG TCAGCTGGGC TGCTOCGCCC TCCGCCGCCT GCTCTCTMTC TCGGGTCCGG 1440 

agccsotcgc taag^gag cg~vvk:::cca gcagccagca gctgcgatga aatctcatct isoo 
45 gatgg'I'.-act T':g;ga3ggg gv;- ;g:-agag TGGTorcASG aaa-3CCCGag <x3GA3gctaa 15-60 
gaagtg2CGg aagi^atcg catc;a3cag gg3Ga:ca.t ostgcacggc :tgccc^cg 1620 

AAGAAGG2CT G7CAGCGCTT TCTG3ACTCA -CTGT3CT3C AGGTTCTACT CTGTTCCTGG 

CGGTG:GG3C A3C2ACTGAC AA3AG3CCAG TCTGTCACCA .3CCCTCAGCA 3AA.ACCGAAA 



10 



25 



30 



40 



50 



24 0 

3 00 

360 

420 

480 

540 
600 
660 
720 
780 
84 0 
900 
960 

1020 

1080 

1140 

1200 
1260 
1320 
1380 



1680 
1740 



GAGAAAGAAC G3AAAGAGG3 AGTTTGG G7 T CTGTTG3CTA AGGTGTAACA CTT AAAG 2 AA 18 00 



60 



":atc 




AAAAGAAGGA ACAGCTCGTT CTGCTTCCTG CTGAGTCGGT GAATTCTTTG CTTTCTAAAC 2040 

TCTTCCAGAA A 3GACTGTGA GCAAGATGAA TTTACTITTC TTAAAAAAAA AAAAAAAAAA 2100 

AAAAACTCGA 2110 



(2) INFORMATION FOR SEQ ID NO: 211:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 938 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 211:
20 GGCA^GGAA AAAAAAGAAA AAAGAAAAAA GAAAAAAGTT TTTGTACCCA CAG ATT AGC A 6 3 

TTTTCTTGAT GTTTGAAAAA AGTTTAAGOT ATGTCCTAAT TTAAAAATGA GCACAAACTA 12 0 

CTTAACA 3 AT GTCTGTTCCC TCTTCTCTTA CTTAAATTAT CTTTATTTTC ACCATCACCT 180 
CCCAGTG2CG AACACCTGAN CTCTGTGTTT TGTGGTTGGA TCCTGGGTTG CCAAGTTCCT 
ATTTGGTIAG TCCCTGGCCT GT303GCG3T CTCAGGAAGT GGCATGCTCT TC AMG RAGG A 
30 TCGTTCATYT CCAGTATAAC C AWTTTGT T A ATAATAGTTG ATAATTCCCA GCTTTT ACCA 360 
GATGAFTTTT G AC TT ATTTT TCCT<rCTTTG ACCTGTTCAA A3CTAACATA TCTCGGTCAG 42 0 

TTOGGA3AGG GTGGGGGATT TGAGAATGTG A 3G AGG AGTG GGGTTAGAAT GGGTTTGCCT 



25 



35 



45 



240 
300 



480 



ATCTG03CAA GGAAAGAGTT CCTAGTCGAT TGGGCACAAT GACAAAATCA TTCCATGGAT 54 0 



600 



AGAATCGTCC CATGTTGCTG GAACACCTCA CGTGTTGTGA ACGCCTTAAA TTCCTGCCAT 
40 CCCTTCTCTG ATTCCCCACC TCCCTGTAGT TTC CACAGG A TTTATCTCTC TGTACCCCCG 660 
TCCTCCAACT CTACTCTGTC AGCCTCTCCT CCATCCCTTA CTTCCCTTCT AAATTCCAGG 720 
AGATGAC C T C ACTTTGCAAA GCAAATTGGA GCCACCAAAT TGTAGCTCTC CTCGGTGGAA 
ACTGCATCTG TGC:TCATCCC TGCACCTTCT TGCAGAAAGC CGCCCCCTCA GGCCAAGATG 
AGTGCCT3GC CCCCATGGGA GACTCAGACA CTTTGACCCC TTGTGACTTC AGCATCTCCC 



780 
840 

900 



50 TCTTTAAAGA TTCTCTCCCA ACATTCAGTC GTGCTCGA 933 



(2) INFORMATION FOR SEQ ID NO: 212:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1551 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double






(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 212:
A3GCTGGACT AAGCATAGAG AACCAGGACA GAAAGAAAGA TTTAAGAGAC TG ACT AAT AT 
TTTTTGACAG ATCATTTAAG AAACTGAGTA ATI TTTTTTT TCTCCAAAAG GGCATGGGTT 120 
TTTTTTTTGT TTTGTTTTTT CTCTATTTGG CACTTTCTAG GGATTGGTCT ATAAATTTTT 
TGAAAGATCA TAG" 3AT AAAT TTCTTTGTAG CAACTTCCTA TTTTAGTGTT TAT GTTAGGG 2,0 
CARCCCCARG TGT^CCTGCT GATACGCCAT TAGGGCCACT TCTCAGCCTC TGG2TACATC 
ATAATGCTTT TTTTTCTATC TTGCCAAAGT TT2CMGAAAA TTKAKGTTTT CTAATTTTAA 

^,„„ Y ., r . r . TCTTTATAAG CCCTGAAAAT AAGTCATTTH 420 
AAAAATTGGT TGTGGAGATG GjATjGGACC rCTTlAl.^ 

TTTTAACTKX: TATTCTGCTA TAAACCTGAT TCTCACTTTT TTCTGTAGAC AACAGTTTTT 
T AT AATAT AT C T ATTTTGTG TGGACATTAT TTCCTTTTAA CCAATACTGA AA-CCATAG 5*0 
TGTAWACTTT CTGCACATTT TCTTTGATT A ATACTTVCXT AAAATAGACA CTTC-GATTGG 
CACCAGCTGT 7AATAAA GCTGCCCTGA ACATTGTCAA TCAATCCTGT TAACCAATTT 

CAGAATTTTT C^GAATGCT TAGTTAGGGA TGAAATTGCT GGGTTATAO. TATGAGTATC 
CrPGATATAC TTTT^G AATGTCTACA CGTGTGTGTA CACCAGATCT CCAGA.GATAG 7 3 0 
30 GGGAATCTTA TGTCCCTGCT AACTGCTCTC CTTATTTAAT TTTCTGACA.T TTGCCGCCGC 8,0 
CGCCGC2CCC T-3CCCCCAAC ACACACATOG TATAAAGTGG TAGTTTCTTG rrTTAAA™ 

35 AAcrrrrcAA tgatttcaat ttgggcattt ct-ttgtatcc tgagttattt txtitcccg 

XT AT 1 GTG AAT AT TCTTTT C C TATGCTTTAA CTACTTTTCT AATTTGTCCC tTTTTTTKjGT 1C20 
TATCAAATTC C AGGCC ATTG TCTATTCCAT CGTCACTTTT GGGTATTGGA AACATCTTTC 1C30 
40 CATTC^TAG CCTGTCTGTT GAACATAAAT CTTGATTTTT ATGTAATCAG ATTTITCr*: 11,0 

„ , ^rr^rr^prrvn "-'"T 1 ATCCTG A CiACCACAAAA 
TT ACGGTT AT GTTCTTGGAA TTTTATTTAA OA^TCTTI T ._A1_ 

45 ATO tccccac — rm-TTC tgtttcatag ttttg^ctt-. * — 

ATGTGTAGTT rATTTTATAT GCTTTGAAAT AGTTCTTATT CATTTATTC*A ACACATATTv 
CTOGAG^CC TGCTGATGGT AGTACTCTTC AGAGTACTTT GTATATATTT GTGAACACAT 

50 ™, .^.^irar rrN^ACT-^fX; TTTCCACCTG 144 0 

ATT C TTGC C C TO'.; AA l IX17TT A TGTTCTCCTT CAA^TAGAT <_CNTA-l- 

,^. T r-AOGAT G AATTC CAC A ATTTT AC AC A TAGCACCAGT TAAGGAATAG 



1200 

1320 
1330 



1 ^ '--j u 



(2) INFORMATION FOR SEQ ID NO: 213:






(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 997 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 213:

10 AGAGAGTCCT CAA2AGAACC T AATCATGCT GGCACCCTAA G^TCATATTT TTAGCCCCTA ~0 

GAACTGAGAG AA2ATAAACT CCAGTTCTTT AAGCTACCOA 2VCTA7G3TA TTTGTTATTA 12 0 

TAGCCCAAGC TAAGTCAGGT GGAAAGGCAG AAAT A 7TT7 G AZLAATLARTCA 7T TCT ACAAA 13 0 

15 

AAGAGAGTTG TT2TAAAT3A AATGGC CA,G A TA7TCCAT7T TTTTGAT.- Ao-TAGTTA.G 2 4 J 

AAAGTTTCAT TAAACACCA7 TTGGGCA3CA CCCAGGCC7G CCAC7G7CAG AACGGCAAAC 30 2 

20 AAAAQ 2 AAAT GATTTGAGGA ACAAAAGAGT < 7GACA22 AG AG CGTCTCAiGAA. GATGG2727 2A 3 6 0 

TCTTCTGAGA TGATCTTCT3 AGATCATCAA TTTTCTOCAC CTGA7GTCCT ACTCCAA7TG 42 0 

TAGTAGATAA G AGCAAAG A ■ 2 ACTTCGTGAT CCTGCGGAAA A77GC7 GG A3C 7CTGCTGA7G 4 30 

GAGAG 3GTGA CACTGOTACG AACAGAAGGC CGGACATT7A 7777G 7TGCAG 77CTTCTG7A 54 0 

CCTGGGCCCT CTT CAGG2CT TGTACCTTGC ACTCCGCA7G C 2372*7 GTAGC A 2 G TGGT AAG 60 0 

CTGAAGTTAG < jTATTTG AAG AGATAATTTG CCCCCAACAA 337AA7TAC~7C AAAAGAAAAA 6 50 

GGAAAGCACT AAATTCCACT TGACAAAC3A GTTTGTTCAG 7 T 7227G.-.7TG r 7 7 GC AAATTTG 72 0 

AAACTTTCTC TTTGGCACCA T AT3ATTC 7"G TTACATTAG3 C-CTT CA7CAAT 3CTAAGA7AC 78 0 

ACAGCTAGGT 2TACCAGCTG CCAGTO 0TCA AGAA7G AA7 3 AACTTCTOTTG 3GAGA3A77A 3 4 ] 

GTTTCTAATA ACCTAACAGT TTTCCTTGG3 TAITACMAAA AAAAAAAAAA 7TAGAATAA7 9 7 0 

40 ATGT CAGTGC 2AT3CAGGCA A3TACAGATA TGGAAATGAA AC-CTGTGTCT ACAACTC-7AA 96 0 

GATTTGTTTG TTAATAAAAT TGATT03GAT CACTCGA 997 



25 



35 



45 



(2) INFORMATION FOR SEQ ID NO: 214:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1496 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 214:

G AATTC GGC A CGAGTGACCA CAG ATATCTT TGO<7TTTCA3 GCTCAC GAGA ATGCTOTC CA zO . 

CTATGTTTTT TTTAATCGAT TGAC AT 7TCA TGAATCCACA AATT7AGCCG '2TTTTCC-.TC 120 

60 






130 



TTTTCCATCT TTGTCATAGC TTCATCACGC ACGATGGAGG TCA 3TTCAGC ACTATCCGGP 
GCGGCCTCAC ( 3GACA r - ; ATC P. GTGAATTO XT TTTTCCTTTT TCTTGATGTA CCGGATTGT 7 240 
G ACT 7GTT AA Z ATTG AGCT' 1' ATGGCCAA 7 A GCACTGTAAC TCATGCCT3A TTGGAGCTI A 
TCCAACACGC GGAI^TC CGTAA3GSAM ATCAMGGT 3T T CTTT 7GCTT A^AACAC'IC, 
GGCARARCTT AARCACTAC3 CTTG3GGGCC ATTTT AG AAA GCAAAACCAC CCA3AAAAA3 
C AG AAAAAAA AGT 3TCAGTA AACAGA3TGN NGANAG3ACT C TTTGTTT AC AGCACA-GGAG 
CG^.GACTAG AA03C30303 TTCTCCCCAG TTCAAACTTC AG^GGAAC CTTAGCTCC, 
15 CCAACTCCAA ATTTTCACCC TCTGCCOAT3 CCCGGGAAAS AAACCCCCAG AAC AGT AC- 2G 
TGATCATTGA TTT T AGC jGTT AGAAATACAT TTTA< X'AAGT AA3TGAATTT UXATTACGA 
VTCAATGATT AATGAACX-rC ACCTGTATTT C Z AT AG A' V AT GTAATTTTAT TT;^<30AG3T 

tt a rrATATT aagoog,/ a ggcagogxg aagagta::aa gtitc-agcat o::ACOOCGrc 
c^g:;coggtt cggocto<o:a gcga3GGGtt cagggacgcc acxtccggagg 3atc3GC 

AAGT3TCG7A GGGC AAC 2 AC GTAGTACTCT CTGCGCATGT GCAAAGCGCT GTCGGGGOJC 
GCCCTAGCTG CC0-3CCCG3 CCCCGGGGCT CTAT3GTCTC TCCCTAGAGC TTT' 3CC ~>TTG 
GAGGCGGCTG CT3CGGTC TT Cr^AGTTTGA CCAGCGTCGA GO C- 3C AGC AA CATGGA3GAA 

TTCGACTCCG AAGACTTOTC TACGTCGGAG GAGGACGAGG ACTACGTGCC GTCISGGTGAG 
CGATTCCGCC T3A OV7GACA AC^^AATIGC CCO3CCO0AC GCCTCACGTG AGGOGCGCTC 

35 to:ccccgcg goc otctvxc ctcigmcca ogtggtgcag ogoogotcct gttotcgagc 

GT3CGCTCCC tca;:«goo::c:t catcctcg3C cgctcosgcc cgaggogtgt ozczgt^azg 1260 
gttctgtgct cccct::c7gt tggtcatctc cggccgccgc cg;:tcttgc aocgogggaa 1320 

40 COGCACATGG A2ACGGCCCC TTGT3GCTAG GGACGCTCGT CG3TCAGCCC C3AA03ACAA 1330 
CCCTGCTTCA GAAOTCOGGG CGGCAGTTCG AG03TTGGAA 3TT r n v TTTCA CCCCT0O3C0 1440 
45 GAGAOAGGTG O ^ 7,A ;A ACCCGTC2AA GAT AG AGG TG TCCGrrTCTCC GNCTOG 



300 
3 6 0 
420 
430 
540 
600 
660 
720 
780 

'GG 84 0 

900 
960 
102 0 
1080 
1140 
1200 



(2) INFORMATION FOR SEQ ID NO: 215:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1308 base pairs






CTGCCTTTGA CCCATCACAC CCCATTTCCT CCTCTTTCCC TCTCCCCGCT GCCAAAAAAA 12 0 

AAAAAAAAGG AAACGTTTAT CATGAATCAA CA3GGTTTCA GTCCTTATCA AAGAGAGATG 180 

5 

TGGAAAGAGC TAAAGAAACC AC Z CTTTGTT CCCAACTCCA CTTTACCCAT ATTTTATGCA 24 0 

ACACAAACA: TGTC2TTTTG GGTCCCTTTC TTACAGATGG ACCTCTTGAG AAGAATTATC 3 00 

10 GTATTCCACG TTTTTAGCCC T C A 1 3GTT AC C AAGATAAATA TATGTATATA TAACCTTTAT 36G 

TATTC5CTATA TCTTTGTGGA TAATACATTC AGGTGGTGCT GGGTGATTTA TTATAATCTG 42 0 

AACCTAGGTA TATCCTTTGG TCTTCCACAG TCATGTTGAG GTGGGC T C C C TGGTATGGTA 4 80 

AAAAGCCAGG TATAATGTAA CTTCACCCCA GCCTTTGTAC TAAGCTCTT"; ATAGTGGATA 54 0 

TACTGTTTTA AGTTTAGC2C C AAT AT A GGG TAATGGAAAT TTCCTGCCC 2 CTGOGTTCCC 60U 

CATTTTTACT ATTAAGAAGA CCAGTGATAA TTTAATAATG CCA :CAACTC TGGCTTAGTT 66 0 

AAGTGAGAGT GTGAACTGTG TGGC AAG AG A GCCTCACACC TCACTAGGTG CAG AG AGC C C 72 '.) 

AGGCCTTATG TTAAAATCAT GCACTTGAAA AGCAAACCTT AATCTGCAAA GACAGCAGCA 73- 

AGC ATT AT AC GGTCATCTTG AATGATCCCT TTGAAATTTT TTTTTTGTTT GTTTGTTTAA 84 ■> 
ATCAAGCCTG AGGCTGGTGA ACAGTAGCTA CACACCCATA TTGTGTGTTC TGTGAATGCT 90'J 
AGCTCTCTTG AATTTGGATA TTGGTTATTT TTTATAGAGT GTAAACCAAG TTTTATATTC 95 ) 

TGCAATGCCA ACAGGTACCT ATCTGTTTCT AAATAAAACT GTTTACATTC ATTATGGGGT 102 0 

ATGTATGACC TTCATTTT2C AAGAAATAGA ACTCTAGCTT AGAATTATC-3 ATGCTCTAAA 108:) 

ATGTCAGAAT GGG AACT 2TC CTCGAACTTC TCCCAAACTC AGAGACAGCA CTGCCTTCTC 114 0 

CTAAATGATT ATTC TTTTC T CCCTGTTTTC TGGTATTTTC TA3GCATCCT TCTCACCACA 1200 

40 GCCATAACCC TTTTTTACTT 02ATTAGGCC GTATAACTGG NGGGACNGCT GGTCGGTATA 126 0 

TAATACTGGT WCCAACAMAG G3GTTCTGGA TGTACACMAG GTTATCTT 13 08 



25 



35 



45 



(2) INFORMATION FOR SEQ ID NO: 216: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1705 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 216:

TGGCCATSGA AGCGCTAGAA GGTTTAGATT TTGAAACAGC AAAGAA* ^G AT TTCCTTGGAT - 60- 

CTOGAGACCC CAAAGAAACA AAGATGCTAA TCACCAAACA GGCTGACTJG GCCAGAAATA 12 0 

60 






TCAAGGAGCC CAAAGCCGCC GTGGAGATGT ACATCTCAGC AGGAGAGCAC GTCAAGGCCA 
TCGAGATC ? 3 TOGTGACCAT GGCTGGGTTG ACATGTTGAT CGACATCGC: CGCAAACT3G 
A-JAAGGCTGA GCGCGAGCCC CTGCTGCTGT GCGCTACCTA CCTCAAGAA3 CT3GACAGCC 
CTGCCTA1GC TGCTGAGACC TA~CTGAAGA TGGGTGAC "T 0\AGTO"TG GT3CAGCTGC 
A3TGGAGACC CAGCGCTGGG AT3A3GCCTT TCCTTTGCOT GAGAAGCATG CT G AGTTTAA 
GGATGAHAT 2 TACATGCCGT ATCl'TCAGTG GCTAGCACAG AACGATC GCT TTGAGGAAGC 

c:ag;,a,v:,;;g ttc:a:aacg c^ocgaca gaoagaaocg gtc-~acgtgc tcgaccacct 

C ' AC AAACAAT GC ^GTGGCGG AGA.'L-C'AGGTT IYwVIVA^ 'l' GCCTA1TA1T ACTGGATGCT 
viTCCATXAG TO3CTCGATA TACOTCAAGA TOC^GCC :AG AAGGACACAA T~CTTGGCAA 
:-TTCTACGAC TTCCAGCG'PT TCCCAGACCT GTACCATC-GT TACCATCCCA TCCATCGCCA 

GAC03AAGAT ccgttcagtg tccatcctcc tcaaact?tt ttcaacatct ccaggttcct 
■-vrTO^ACA :;c ctvoggaa^ AOACrCCCTC GGGC^TCTCT aaagtgaaaa tactcttcac 
25 cttogccaag cagag::aa3G ccctcggtgc ctacagoctg gccggo:agg cgtatgacaa 
ggt:;cgtg:;c c-gtacatcc cp;:cagatt ccaaaagtcc attga3ctgg gtaccctgac 
catccgcocc aa^ccttcc acgacagtga ccagttogtg occtktoct acccctgctc 
caccaacaac ccgctgctca acaacctggg c:aacgtctcc atcaactcc oicagccctt 

CATCTTCTCC OOGrCTTCCT ACrACGTCCT AC AG C T 1 3* 3TT GAGTTCTACC TGGAGGAAGG 
35 GATCACTGAT GAAGAAGCCA TCTCCCTCAT GGACCTGGAG GTGCTGAGAG CCAAGCGGGA 
TGAGAGACAG CI ACIAGATTT G^AACAACA -3CTCCGAGAT TC TTCC G-G C T AGT03GAGAG 
CAAGGGACTC CATCGGAGAT tiAOJAJGCGT TCACA3GTAA GCTRAGCTTT GAGOAA3GTG 
GCTCARAGTT CGTG2GAGTG GT3GTGAGGC GGCTOGTGCT GCGCTCCATC AGG CG3CGGG 

at3Tcctv;at caag::gaik^ gcg::cacccc ^ac^tggca atacttccg:: tcactctgc 



30 



40 



45 



50 CTGTGAAGGA 3AATAAAGAG TTAAACTGTC AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 

aaaaaaaaaa aaaaaaaaaa aaaiia 



180 

240 

3 00 

360 

420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1330 
1440 



ciga::gc::tc cattaccatg 7 >::c^r o~~tccagat cttc:at7ct ^g-gactatc if.oo 



1S60 



: "« T ,- ACCAT GX"PG3^3C'3 CCTACT3GCG CAGG FCX7AAG 3ATGACCCTG 

_ T ~ ^ rr ,^ T -rry^ .".-q-c'TIOGG 3TCTGGTGGG 162 0 

GCGCATGACC A3GATCCTGG GGA'-Ga^-CT^ l^Cl.T^^ -^ U1AAJ 



1680 

170S 



60 
(A) LENGTH: 999 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

5 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 217: 
AGCAAATCAC CTTAACGATC TGG AAT< 3AAA CTGTGACCAG TGCCGCCCTG GGTGGTTCTG bO 
10 GAGAGACTGC CGTCTTCTTG TTTGGOIATA Q3TGCTGGGG CCCO0GCTTC AGTCACTGTC 120 
T CAG AC AG K A GTCCCGATAA GOAGAT2ACC AGTCCTCCAC TGTCCTTCCT GTCG3CCTTG 



15 



35 



ATTGAGTGYC TT AAT AAT AG TYTACAAATA CTATGTATTT AT GC AAAACT GTTAAAGTTC 



20 TGTCCTACTC CATACGTOTT TATCCTGCTA TGCATTTTAC ATTGTGTGTT CAC AT CT ATT 
CCAAOGAGCC TTOCTAGAAA C AACA 2TGGC GGTTCCTGOA GGC CAGGC AG GCAITGGCCC 
atgctgtgt: CCATAGGAGC CAATGGAAAG AACGTAGCTT GGTCTGCTAG CCAGCCGTGG 

25 

GGTGGCGCA3 GCCAGGCAGC CTCTGCACCA GAGTCCAGCA CCTGC2CATT CCCCAGTCAC 
ACAATCATAC TCTTOTTTCA TAGAGATTTT ATTACCACCT AGACCACCCT AGTTTTCCTC 
30 TCTGTTAGTG TCCT^AGGTC TTTTGCAACA AAATGTAG0T ACAGACCAAT CCCTGTCCCT 
TCC CCAAT C A GGA 0CTCCAC ACCATGAGTT GTTTGGTTTT C C AGAAGCTG CCAGTGGGTT 
CCCGTGAATT GOG ITAAGAT ATCGATGATK TTTTTTATTG TTTTT(:TT(:T TGTTTTTTTA 
AAT AAT AT AT TTAAAGGCAG TATC TTTTGT ACTGTGAATT T3CAGTAGAA GATGCAGAAT 
GCACTTTTTT TTTACTTCTG ' VT< 3CTGTGT A TTGTATATAG TGTGTGTGCT TCTTGTGATG 



130 



CTGCATGAGA A<0ATAGCTGC TTCCTCCCTC TTTTCCTACA CTGTAAATTA TTGTTTTACA 240 



300 



TCATCTGTTA T GA/TTOGAT A CTTGCtTCTTG TCAGTAGTGG TCAGCATTGG GTTGTGAGCT 3 60' 



42 0 

4 8 D 
54 0 
6 3 0 
660 
72 0 
78 0 
84 0 
900 
960 



40 AAAATAAACT TTTTCTTTAT AAAAAAAAAA AAAAAAAAC 



(2) INFORMATION FOR SEQ ID NO: 218:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 941 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 218: 

55 GGCACGAGTA OZGYTTTCATT TAATCTGCAG GTATATTCTC CCAACAGTTT ATTGTC ATGT 

GATGTCCTCA GCCAAGATTG T RAGGC AG AG AGGAGCTGTC CCAACCTACT AT AC C AC C GA 

GGCTGGAGAG ATCATATTTT TGGTATTAAA CTGGAGTCTC TCCATCCTTC AC ATTG TTG A 

60 




tgtcctstgt agoaaaccgs aaaagtcagt gagagaagat gc-otagcg gtttgagcga 

GAGAATGAGA GGTGTGGTTT GG AG AAAAGG GC CGG ATGGT G GCT G T AG AA AGCCCATCCT 
TCT GGTGTTC TTTTTTCTCC GGCTTATATT GTGCTTTCAT TGATTCATTG ATTCATCAAA 
CATTTGTTGA GCACCTATTA TOTCTCAAGC TCTGTGCTAG CCCCTGGAAA ACCTGCCCTC 
ATGTAGCTCA CTGTGGAGTA GGA 1 GAAAC AA TGACTAGACT ATGATAAGCA GGGGTTGTCA 
GGGTGTGAGA G AGO AGTGG- 2 C CC TG ATG G A GAG GGATGAG GT 7AAAGAAG GC ATCC AGGC 
GACJGATGGTG TCAGAGCTAA CTCAAGAATG AGAG3GACCT GOACCASCAG OGOTTGGAAC 
T< 3AAQ jTGGC agtgccto.a GTCTTGA^C CAGGAGAGGG AGAGCAGTCT GTGAAAAGGC 
ACC AA» 3GGTG C^^GOGCA GAGCACAT3G AGGAACTTCA QVrAGTTCTG GATGGCSCTG. 

,, fraa ™ rAAATGT^CC T-GAGTTACA TGAACTTCCA 
G3GCAAAGCT AGAGA-jGTmA ^^GAAT'.TA t^AA^ 1 *^ 

TCGGAATAAA G G C ATTGG AA AGGAAAAATT TAAGTGAGAA GTGCATTTAA GCOTGGTCCG 
AGTAGAATGA TmTACAAC GAATTGATGA GAAGGAGTTA CAGATGTCTT TGTTC GTTCT 
CCACTCCCAC TG2TTCACCT GACTAGCCTT TAAAAAAAAA A 



240 

300 

3 6 0 

420 

480 

540 

600 

660 

720 

780 

840 

900 

941 



(2) INFORMATION FOR SEQ ID NO: 219:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 575 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 219: 

40 T AAGTGOAAT C 7CCCGSGGT TG2AGGGAAT TCGGOACGAG ^ATTCIGAG AAGCTTAAGA 

C AT ACTTTG A AGAGAACCCT AGGGACCTCC AG7TSCTOCG OCATGACCTA CCTTTGCACC 

ccgcagtggt gaaggcccac ctc^::catg ttcctgacta cctggttcct cctgctctcc 

^ GIXtGCCTGGT RCOCCCTCAC AAG AA 1 ' ICQ'* ? A AGAACCTGTC -'COO-TOT AGGAAOGOCA 
AG AG Pit 3C AAA GFCCCA^AAC CCACTGCGCA G2TTCAAGCA 2AAAGGAAAG AAATTCAGAC 
50 GGACAGGGAA GCCCTCCTGA CCTCTCTG0A GOTGAGCACA TTGTGGAGCA 

CAGGOTTACA CCCTTCCTCW ASAGOGGPvGG CTCTGGTGCT TACTGCACAG CGTGAACAGA 



60 
120 
130 
240 
300 
360 
42 0 

.1 'A - 



60 



(2) INFORMATION FOR SEQ ID NO: 220:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3013 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 220:

GCCA3CCTTA CAGGTTTTAC ' 3T 3 AAAT 3 AA A3O0ATTGGA ATAGAACCCT CGCTTGCAAC 60 

15 AT AT ' 3 ACC AT ATTATTOOC : TGTTTGATCA ACCTGGAGA0 CCTTTAAA3A GAT'OATCCTT 12 0 

CATCATTTAT GATATAATGA ATGAATTAAT G3GAAAGAGA TTTTCTCCAA AG3ACCCGGA 130 

T 1 3AT GAT AAG TTTTTT CAGT CAGCCATGAG 'OATATGCTCA T3T0TCAGAG ATCTAGAACT 24U 

20 

TO0CTACCAA GTACATG3C3 TTTTAAAAAC CGGAGACAAC TGOAAATTGA TTG3ACCTGA 3 00 

TCAACATCGT AATTT C TATT ATTCCAAGTT GTTCGATTTG ATTTGTCTAA TO3AACAAAT 3 60 

25 TGATGTTAC2 TT3AAGT3GT AT3A3GACCT GATACCTTCA G3CTACTTT0 3CCACTCCCA 42' > 

AACAATGATA CATCTTCTGC AA<G3ATTG3A T3T03CCAAT CGG3TA3AA3 TGATTG3TAA 430 

AATTT G 3G AA. AGATAGTAAA GAATATG3TC ATACTTTCGG GAGTGAG3T3 A 3A3AA>3A;3A 54 0 

TCCTGATGCT CAT3GCAAG3 GACAAGCACC CACCAGAGOT TCAG3TGGCA TTT3CTGACT b0f.> 

GTGCTG2TGA TAT'OAAATCT G3GTATGAAA G3CAACCCAT CAGACAGACT G3TGAG3ATT 660 

GG3CAG2CAC CTCTCTCAA3 TGTATAG3TA TCCTCTTTTT AAGGG0TGG3 AGAACTCA3G ^2 0 

AAG3<3TGGAA AAT3TTQGG3 ■ 2 TTTTC AG 3 A AG 3 AT AAT AA. GATTCCTAGA AGTGAGTT-G3 73 0 

TGAAT3AG2T TATG3ACAGT G3AAAAGTGT CTAACAG3CC TTCCCAGG3 3 ATTGAAGTAG 840 

TAGAGCTGG3 AAGTGC CTTC AG3TTACCTA TTTGT3AGG3 CCTCACCCA3 AGAGTAATGA 90 0 

GTGATTTTG3 AATCAACCAG GAACAAAAG3 AAGCG CT AAG TAATCTAACT 'GO ATTG AC CA 9b 0 

45 GTGA3AGTGA TACTGACAG3 AG3AGTGACA G3GA3AGTGA '2ACCAGTGAA G3CAAATGAA 102 0 

AGT- 3GAGATT CAG3A3CA3C AATG3TCTCA C0ATA3CTO3 T G 3 AATCA'OA CCTGAGAA3T 108 0 

GAG AT AT AC G AATATTTAAC ATTGTTACAA AGAAGAAAAi 3 ATACAGATTT G3TGAATTTG 1140 

50 

TTACTGTGAG GTA3AGTCAG TACACAGCTG ACTTATGTAG ATTT AAGC T 3 ' 3 T AAT ATG2T 1200 

ACTT AAC C A' r CTATT AAT 3G ACG ATT AAAG G3TTA3CATT TAAGTA'GOAA CATTGCGG r rT 1260 

55 TTC AGACA 3 A T03T3AGGTG CATGGCTCTT GTCAT i 3A , 3GA TAAGO3TG0A -OACCTAGAGT 1320 

GTC3GTGAG3 TGACCTCACG ATGCTGTCCT CGTGCGATTG CCCTCTCGT3 CTGCT3GACT - 133G- 

TCTGCCTTTG TTGGCCTGAT GT3CTGCT3T GATGCTO3TC CTTCATCTTA GGTGTTCATC 1440 

60 



WO 98/54963
CAGTTCTAAC ACAGTTGGCG TTGGGTCAAT
AGAAAT AAC G CATCTTAG3A ATGACTAAAC 
5 GTAAAATGAC TGT AG AT AAA TGTTGTAATT 

gccgctgc:;a tagttttcta acttgaacag 
ttgtctatag ctgttacc7a ttttagtggt 

10 

GT03AATGTC TTCTTGACAT CATTGTGTAT 
CTAGCAGCTC TTAGCACTAT GACTTAAGTG 
15 GCATTCCTGC C TTC AT0~ AQj GCTTCTACCA 
ATTT A 3TTAG GTACCCCACG AGTCGTCCAC 
TTGACAATTA TGGGATACTC TAGTCTACTT 

20 

ctcaaggc:a ggacccaotg atacatcctt 

TGGAATGT3A GGTGTTAAGG CATTTAATTG 
25 TAGCTAGGTA CTTAAGCATC CATGGGTATT 
AAAGAGTAAA TTAGCCTTCA GTCTTGGTTT 
AAATGTTAAC TCGGTCCCTT CCTGTCTCTA 

30 

TG7TA'GCCTT ACTATTCAAT ACAGTCCTTA 

acctattctg aatcaccatg ttgctctgca 
35 gtagatgocc tatgaai^ttg tagtagactt 
tttaagtttg t g 3 tag ag at cctccaaacc 
tagagaaaaa ttaaggcc:tc ACAGGATGAG 

40 

atagtcttta gcgtctaact atg ag t aaaa 

ac: 3tgtaagg tcagcacttt q:ogaggo\ga 
45 gagagtagcc toggcaacat agt^gacac 
ggtggtatgt at 7tgtgtgg cao:7taattg 

TAGGAGAG3G AGGTTG2A 1 jT GAGGCGTGAT 

50 

AGCAAGA ICC T GTCTT 1 G 3 AG AAAGG AGAAT 
GG-C TCATGCC TGTAATCC 




AGTTT2GCAA TTTCAGGATA TTT2GATGTC 1500 

AAGATAATGG CAGTTTAGGG TGCACAACTG 1 r - r 

AGTGTACACG TTTGTATTTT TGTTAATATA 162 0 

C2ATGAATGT TT3ATGTCTC C3TTTTTTTT 16 80 

TGAAATGAGA GGTAGTGAT3 ACAGAAGGAT 1 7 40 

TOGTGGTAAT CAAG TTGGTA AC7ACTACTT 130 0 

GT 1 7 CT ' 3G AAG GCAGIAAGTG G A f-GTTTGCA 1360 

CTGAOCACTT TGCACGTACC TGGCTC CC A ( 0 13 20 

ATAACiCAO-CT TCATCTTTAC CTTG* 0 C AG A 0 1 

ATAGTTGTGT T7CCATCTGT CTOG 7ATC3T 2 04- ) 

AGAAACCAAA Gl ATGGTTTT TGTTTTCTCT 2100 

AGGG AC AAAA AAAAAAAAAA GC C G AT AT AG 216 0 

G7TCCATATC AAAGCAGATT TC-CAGGACAG 2 22 0 

ACACXTTTCCA AGGAGAGCCT TGGSCACCTG 2 2 30 

GTTCAT<2AGC ACCTGCAGAT iGCCTGACTCT 2 34 0 
GATTCACQ3T ATCX'CTCTTC GTATGCAGOG 2 4 00 

GCTAGAGTTG ATAGGAGAAA AT 1'CATTTOG 2 460 

TCAAAATGAG ^GATTTGTTA GCTTGGTACT 2S2 0 

■ oat a' 7tctg a c*c aattaact gcgttgaaca 2 580 
tgtcgattct ctotaaatgc ttattttatc 2040 
tgttctcttc g3cggggtgt ggtgactcac 2700 

G^TGGGAGGA TCA3TTAGGT C 7AGGAGTTC 2700 
CGGATCTACA AAAAAATAAA AAGCCAGACT 2 3 20 

G^AC-TGTGAG ATG0GA32AT TGTTTGAGOC 2830 
C 3CACCACTG CACTCCAGCC TGOGGAACAG 2 04 0 

TTTGG AAG AG CAAATOGGGC TGA3TGCA3T 3 000 

3 K3 



(2) INFORMATION FOR SEQ ID NO: 221:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 963 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

5 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 221: 
GGCACGAGOG CCGCGGGACA TCCACG3- 2GC GCGAGTGACA COO 5GG AGO 2 AGAGCAGTOT 6 0 

10 TCTGCTGGAG CCGATGCCAA AV\CCA'r>:A TTTCTTATTC AGATTCATTG TTTT CTTTT A 120 
tctgt202GC ctttttactg ctcagaga2a AAAGAAAGAG GAGAGCACOG AA2AA2TGAA 180 
AATAGAAGTT TP3CATCGTC CAGAAAA2TG CTCTAAGACA AGCAAGAAG2 GAGACCTACT 240 

15 

MAAATGCC2A TTATGACGGC TACCTOG2TA AAGACG2CTC GAAATTCTAC TGOAGCCGGA 3 00 

C AC A AA ATG A AG3CCA2 2CC AAYT2GTTTG TTCTTGGT Tr TG03CAAGTC ATAAAAGGGC 2 60 

20 TAGACATT3C TATGACAGAT ATGTGC 2CTG GAGAAAAGCG AAAAGTAGTT ATACCCCCTT 420 
CATTT2CATA CGGA^AGGAA G2CTATG2AG AAGGOAAGAT TC C AC CO 3 AT GOT AC A' V V CA 4 30 

TTTTTGA2AT T 2AACTTTAT 3CTGTGAC 2 A AA3GAG2ACG GAGCATTGA3 AC ATTT AAA.C c >4 0 

25 

AAATAGACAT G2ACAATGAC AG2CAGCT2T CTAAAGCCGA GATAAACCTC TACTTGCAAA tOO 
C5GGAATTT3A AAAAGATGAG AAGC2ACGTG ACAAGTCATA TCAGGATGCA GTTTTAGAAG ( : b0 
30 ATATTTTTAA G AA< 2AA T 1 2 A' 2 CATGATGGTG ATGGCTTCAT TTCTCCCAAG GAATACAATG 72 0 

TATAC CAA2A C 2ATGAACTA TAG2ATATTT GT ATTT C T AC TTTTTTTTTr TA2CTATTTA 780 
CTGTACTTTA TGT ATW AAA 2 AAA2TCM 2TT TTCTCCMAGT TGKATTTGCT ATTTTT 2C2C x4 0 

35 

TATGAGAACA TATTTTG AT 2 IGCCCAATAC ATT GATTTT ■ 2 2T A T AAT AAA TGTGA'3GCTG 90 0 

TTTTGCAAAC TTAAAAAAAA A TTT AAAAAA ACTGGAGG3G GGCCCGTA22 CAANTCGCCG r >6 0 

40 NATATGAT 95 8 



(2) INFORMATION FOR SEQ ID NO: 222:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1404 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 222:

C GTTTTCCGG CCGTGCGTTT GTGGCCGTCC GGCCTCCCTG ACATGCAGCC CTCTGGACCC 60 

CGAGX2TTGGA CCCTACTGTG ACACACCTAC CATGCGGACA CTCTTCAACC TCCTCTGGCT 12 0^ 

TGCCCTGGCC TGCAGCCCTG TTC AC ACTAC CCTGTCAAAG TCAGATGCCA AAAAAGCCGC 180 



CTCAAAGACG CTCCTGGAGA AGAGTCAGTT TTCAGATAAG CCGGT(. *CAAG ACCGGGGTTT 240

GGTGGTGACG GACCTCAAAG CTGAGAGTGT GGTTCTTGAG CATCGCAGCT ACTGCTCGGC 3 00 

5 AAAGGCCCGG G AC AG AC ACT T TGCTGGGG A TGTACTGC-GC TATGTCACTC CATG3AACAG 360 

CCAT<3GCTAC CATGTCACCA A3GTCTTTGG GAGCAAGTTC ACACAGATCT CAC 2CGTCTG 42(1 

GCTGCAGCTG AAGAGACGTG GCCGTGAGAT GTTTGAGGT 3 ACGGGCCTCC ACGACGTG3A 48(1 

10 

CCAAGGGTGG ATGCGAGCTG TCAGGAAGCA TGCCAAGGGC- CTGCACATAG TG3CTCQ3CT 54 0 

CCTGTTTGAG GACTGGACTT ACGATGATTT CCGGAACGTC TTA 3A- :AGT3 AG3 AT GAG AT 600 

15 A3AG3AG3TG AGCAAGACCG TG3TCCAG3T GGCAAAGAAC CAGCATTTC 3 ATG32TTCGT 66" 

G 3T< 3GAG 3TC TGGAACCA3C TG3TAAGCCA '3AA3CG2GTG GGC 3TCATCC ACATGCTCAC 72 <) 

C-2ACTTG3CC GAGGCTCTGC ACCAGGCCCG GCTGCTGXTC CTCCT-3GTCA TCCCG0CTO3 78') 

20 

CATCACCCC3 G3GACCGACC AG3TGGG3AT 3TTCACGCAC AAG3A jTTTG A3CAG3TG03 84 0 

CCCCGTGCT3 GATGGTTTCA GCCTCATGAC CTACGACTA 2 TCT AC A(3CGG ATCA3CCTG3 90 0 

25 CCCTAATGCA CCCCTGTCCT G3GTTCGAGC CTGCGTCCAG GTC2T3GACC CGAAGTCCAA 96 0 

GTGGCG AAG-C AAAATCCTCC T07-OGCTCAA CTTCTATG3T ATG3ACTACG CGACCTCCAA 102 0 

GGATGCCCGT GAG3CTGTTG TCCGGGCCAG GT ACATCC A< 3 AC ACT 3AA3G ACCACAGGCC 103 0 

30 

CC'GGATGGTG TGG3ACAGCC AC-G Y CTCAG A GCACTTCTTC GAGTAGAAGA AGAGCCGCAG 114 0 

TG3GAQ3CAC GTCGTGTTGT ACCCAACCCT GAAGTCCCTC CAOGTGCGC^C TGGAGCTGG7 1200 

35 CCGGGAGCTG GGCGTTGGGG TCTCTATCTG 2X3A3CTGGGC AG3GCCTGGA CTACTTCTAC 126 0 

GACCTGCTCT A3GTGGGCAT 1'GCGGCCTCC CK2G3TGGACG TGTTCTTTTC TAAGCCATG3 132 0 

AGTGAG^GAG CAGGTGTCAA ATACAGGCCT NCAGTCCGTT TGCTGTGAAA AAAAAAAAAA 138 0 

40 

AAAAAAAAAA AAAAAAAAAA AAAA I 404 



45 

(2) INFORMATION FOR SEQ ID NO: 223:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 707 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear






ccgctgcatc gcagacgtgg t-rtoctctt 
g atc c 0 cgo c at 3gat 3a< 3 a t 3 1a gc c cga 

cctgagoo^: atgtcgg:gt cag-.tgaggt 
cgacctggag tcagcctaca acc-c 2ttcaa 
tac-ccctto: acagaagc-gg aga3c^ct2gag 
cagggggtg.; tgatccacac aa:tcac?:;t 
gtgtgagaac tttto3gcgg ggiigotcoi 

AAAAAAAAAA. AAAAAGTCRG '3GGG3GCCCG 



24 J 



::g accttcaaaa 



4L ) 



600 



rcAvrc 






(2) INFORMATION FOR SEQ ID NO: 224:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1384 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 224:

G30GAA3T'3: AGT3ACAGCA GGA 3T AAG A 7 CGGGAGGCA' 

AT3GA3AG33 GGTTOAGCGA G3 37 A GAGA 3 G3-GAGACTA 

tggagggaga ggago^gaaa :a'1-ai-j>:-:- geagaagag 

a3cccgtcag oeytgttggg a3-::v.g::a eactgc-cta 

G3GGT3C3CT TOGTTCTGGT gcttctogcc 

3A3CCCGTCC TO3TGGA03G GGA 3TGC CTG 

GTCCGAAG33 AMCACCATGA G33A3CAGG3 



TACTTCGACC AGGTCCTGGT GAA3GAGGGC C<TT 3G3TTTG AC3GGG0CTC TC<3CTC<3TTC 

GTAGCCCCTG TCCG3GGTGT CTACA3CTTC CG-GTTC 2AT3 TG3T3A_AG3T GTACAACCGC 

CAAA3TGT Z Z AGGTGAGCGT GATGCTGAAC ACGTGC-CCTG TCAT'GTCAGC CTTTGC'CAAT 

GATC CTGACG TGACCCGGGA GG2AGC CAC C AGCTCTGTC-C TACTGCCCTT GGACCCTGGG 

GACC3AGT3T CTCTGCG3CT GCGTCGGGGG 3A.TCT ACT* 3G '3T*3GTTGGAA ATACTCAAGT 

TTCTCTGGCT TGCTCATCTT CCCTCTCTGA GGACCCAA3T YTTTCACA3CA CAA3AATCCA 



18 0 

2-;o 

3 00 
360 
420 
430 
54 0 
600 
660 
720 
780 
840
GCCCCTGACA ACTTTCTTCT 'SCCCTCTCTT
CTCCCTCTGG YTCCTATCCC ACYTC TTT'GC 
5 AGARAARARY ARARCTGWGG CAGGTATACA 

taaccatgc:a tcytcttggt tggccacctc 
ttagtccctc cama:tctga ctgctgcctc 

10 

TCACTGTAC2 TGTTCCAGCA T ATCCC TACT 
ATTCTCCTCC TTAGrOnCC TATTACCTGG 
15 CCTGCCAGTA TGCTAAACCC TCCCTCTCT2 
CTGGATGAAT CTATCAATAA AAC AACT A 0 A 

TCCA 

OA 




GCCCCAGAAA CA GO AG AGGC AGG AG AG AC A 900 

ATQ3GAMCCT GTGCCAAACA CCCAAGTTTA 960 

GAGCTGGAAG TGGACCATGG AAAAC AT S G A 102 0 

CTGAAACTGT CCACCTTTGA AGTTTGAACT 10&0 

CTTCCTCCCA GCTCTCTCAC TGAGTTATYT 1140 

ATC TCTCTTT CTCCTGATGT GTGCTGTCTT 120 0 

GATTCCATGA TTCATTCCTT CAGACCCTOT 12 60 

TTTCTTATCC C3CTGTCCCA TTGGCCCAGC 13 2 0 

GAATGGTGGT C AAAAAAAAA AAAAAAAAAC 13 30 

13 84 



25 



(2) INFORMATION FOR SEQ ID NO: 225:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 750 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 225: 

GGGTCGACCC ACG2GTCCGC TGACCAGTCG GTT AT AG AT A CTTCTTCCTA TACCAAAACT 

GTTTAAAGAG GT3CCACCAC AAGGGATGTG GTCCTTACTC TCTGCGG3TC TTCAAGCATC 

gctttgtggg aaargtctct gg:-caagca-c gtggtatttg gtctgctgct tgcttccctt 

40 TTTCCACCAG GGATGTTGTG AT 2 AT AAGT C AAAAC AAC AG TATATTCCAA AT C TC AAAAG 
ctattgtggc ctgagcacaa TTCAAATCTA C^AGAGTTTT TCCTATGTAG CTTTAGAGTA 
ACTCTTCTGC TTCTCTGTCA CTTACAATTC AGGTTCTGC C TTTGC C T AAG AGCATGAGCA 3 
GAAGAGTCCT CATGTGACGC TTAGTTCTAT TGCAGTCCTG GG TGAAAC T A TTTAAGCWAT 4 
GGGGCTGCTK CTCCCCANWT CCTCCCTAAC AATTCGTTGT GTGGACTTCT CATCTAAAAG 
50 GTTAGTGGCT TTTGCTTGGG ATCAGTGCTC TCTATTGATG TTCTTGCTGG TC TC C AG ACA 
CATTCCTGTT GCATTAAGAC TTG AAAGAC T TGTAGATGTG TCATGTTCAG GC AC AGG ATG 



120 
ISO 
240 
300 



540 

600 






(2) INFORMATION FOR SEQ ID NO: 226:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2057 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



GAGATAAAGG ATATC CGGTT GGTGGGGATC CACCAAAATG GAGGCTTC AC CAAGGTGTGG 



45 AGGATCAC OA TGATGTCCCG ACCOCCAGTG CTTCTGGAAA AAGTCATCTT TGCCCTTGGG 



60 
120 
180 
240 

300 
3 6o 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 226:
CCGAGCCGGC TGCGCCGG3G GAATCCGTGC GGGCGCCTTC CGTCCCRGTC CC ATCCTCGC 
15 C3CGCTCCAG CAC0TCTGAA G IT TTGCAGtC GCC-2AGAAAG GAGGCGAGGA AGGAGGGAGT 
GTGTGAGAG3 AGG 3AGC AAA AAGCTCA0CC TAAAACATTT ATTTCAAGGA GAAAAGAAAA 
A(3G3GG<30CG CAAAAATGGG T3GGG3AATT ATAGAAAACA TGA(3CACCAA GAAGCTGTGC 
ATTGTT^TG -3GATTCT3-:T CGTGTTCCAA ATCATCGCCT TTCTGGTG3G AGGCTT3ATT 
GCTCCAXT-GC CCACAA03GC AGTGTCCTAC ATGTCGGTGA AATGTGTG3A TGCCCGTAAG 
aaccat:aoa AGACAAAATG GTTCGTGCCT TG03GACCCA ATCATTGTGA caagatccga 4 2 

GACATTGAAG A-3GCAATTC 2 AA&3GAAATT GAA3CCAATG ACATCGTGTT TTC TGTTC AC 4 8 

ATT SCO :TC 2 CCCACAT3GA GAT3AGTCCT TGGTTC CAAT TCATGMTGTT TATCCTGCAG 
CTGGACATTG C2TTCAA3CT AAA - 2.AACC AA AT C AG RGAAA ATGCAGAAGT CTC0ATO3AC 
GTTTCCCTGG CTTACCGTGA TGAC 3CGTTT GCTGAGTGGA C T G AAATO 30 CCATGAAAGA 660 
0O GTACCACGGA AACTCAAAT< ; CAC OTTCACA TCTCCCAAGA CTCCAGA 32A TGGAGGGCCO 72' = 

GTTACTATGA ATGTGATGTC CTTCCTTTCA TGGAAATT3G GTCT3TG3CC CAT< 3AAGTTT 



54 0 
6 0<; 



730 



TACCTTTTAA ACATCCGGCT GCCTGTGAAT GAGAAGAAGA AAATCAATGT G3GAATTX3G3 S4!' 



900 



TTTGCCATGA AGACCTTCCT TAC3CCCAGC ATCTTCATCA TTATGGTGTG GTATTGGAG3 960 



102 0 



ATTTCCATGA CCTTTATCAA TATCCCAGTG GAATGGTTTT CCATCGGGTT TGACTGGACC 108 0 



114C 



TGGATGCT3C TGTTTGGTGA CATCCGACAG GCATCTTCTA TGCRATGCTT CTKTCCTTCT 

GGATCATCTT CTGTGGCGAG CACATGATGG ATCAGCACGA GCGGAACCAC ATCGCAGGGT 1200 

ATTGGAAGCA AGTCGGACCC ATTGCCGTTG GTCCTTCTGC CTCTTCATAT TTGACATGTG 1260 

55 TGAGAGAGGG GTACAACTCA CGAATCCCTT CTACAGTATC TGGACTACAG AC ATTGGG AA 132 0 

CAGAGCTGGC CATGGCTTTC ATCATCGTGG CTGGAATCTG CCTCTGCCTC TAACTTCCTG 13 8 0 

TTTCTATGCT TCATGGTATT TCAGGTGTTT CGGAACATCA GTGGGAAGCA GTCCAGCCTG 1440 

60 



0 GTA
CA7 

10 

atg 

1 5 atg 



T" AATTTTT AG GTTCAAGTTC 1500 

7CTTCTTCAT GGTTAGTCAG 156 0 

GCAAGTGAAC AGTGCCTTTT 162 0 

— . — r^ 1 j\ r TGTT"G TTGTATGCAG 1630 

CCAAGTCCCA TGTAAATCGA 17 4 0 

ATTGTTCAGC GCTTCGAAAT 1800 

GAAG AAGGCA AGAGATGTPT 18 6 0 

GAGTTGTATA C'GC'ACACAAA 192 0 

ga^;'Aa;^: g tc aac aat aa i 9 8 o 

aaaaaaactc gtgccgaatt 2 04 0 

2(357 



25 



GG^GGGH: 20c 



30 



5g> 



40 CGGGGGTAGG GGACA~ 



45 



50 



TGTG2 



- A GA GC G 1 A. AGG2GGA7GG G0 2TGTGCGG GTGGTGTGGG 

-G^'GTTG* - ^ AGGGOAAATA ACTGTTCATT TTTCACTCCT 

-GAAAAAGA ATGGGCAGGG TGGAAACCAG AAGAAAAATA 

2G~GTGGGT SCGG-O'GG^GTG GCTGAGTGTG TGGAGTCCTG 

3CAGCGAGT GTG7GOGAGG GAGAAGGTAG GO AGGC AGGT 
GCCTGGGGT GCCCCCTCCG G C ; 0TGGGGC ! 7 GTGTTGCTGC 
_ : ^ ATG-7G-GGG GO CCGGCCCTTC ACTTGG ATGC 



60 
120 
180 
240 



4 20 
480 
540

£ r > ■')
CTCCTCCCAT TGGACTGTOG GGTGCCTGAT AACCTGAGTA TGGCTGACCC CAACATTCGC 84 0

TTCCTOGATA AACT^GCCCCA G3AGA0COGT GACOGTGOTG GGATGAA^GA TCGGGTTTAC 90 0 

5 

ago aac at . xt a tctatg aggt tctggagaac g< 3gcao 0< go 5 o so jo acc'h j tgt j 1 :tgg ag c ' 6 0 

tacgccacco octtgcagac tttgtttgcc atgtgagaat acagtcaagg toogtttago 1020 

10 ggogagjata ggcttgagga gjccaaactc ttct 1 )cc' jga cacttga03a catcctojca 103 0 

GATGCCCCTG AGTCTCAGAA CAACTGCC<3C CTCATTGCCT ACCA03AA2C TGGAGATGAC 1140 

agcagottgt c^ctctcoca ogaggttctg coggagctg: gjgaggaoga a-\ajgaagag i;:oo 

15 

gtta<3tgto; oaajgttga^ gacjtgagog gt-goccagta cgtccaggat gtjogaagag 12 60 

CCTGAGCTOJ 2 1 JATCAGTGG AAT 1 3G AAAAG CCGCTCCCTG Tjj^JAj^ TTTCTCTTGA 13 20 

20 gacg'GAojgt :accaggcca gagjctcgag tggtgtgca\ gcctotggac TGJOGGCTCT 13 80 

cttcagtogc t< gaatgtcca o gaga got at ttccttcga z a jo go go gtt 0:agggaa3g 144 0 

gtcgaggaot tgagatgtta agatojgtct tgtgggcttj gjjgagtgat tt jcgctctg 15*00 

25 

T3AGCCTCGG TGTCTTCA\C CTGTGAAATG GGATCATAAT CACTGCCTTA CCTJrCTCAC 15 6 0 

GGTTGTTGTG AOGACTGAGT GTGTGGAAGT TTTTCATAAA. CTTT^SATGC TAGTGTACTT 16 2 0 

30 AGGGGGTGTO C GAGGT 3TCT 7TCATG3GGC CTTCCAGAC 2 CACTC CC 2 AC CCTTCTCCCC lb 30 

T/TCCTTTG2C CGGOGACGCC GAACTCTCTC AATG3TATCA ACAOGCTCCT T G0O2 2TCTG 1*740 

G2TCCTGGTC ATGTTCCATT ATTGG3GAGC CCCAGCAGA^ GAATGGAGAG GA i 0* GAGGA GG 180 0 

35 

CTGAGTTTOJ GGTA rTGAAT CCCCCOGCT3 CCAG 3CTGCA G Z AT r Z AA GGT T02TATGGAC 1360 

TCTCCTGCCG GGCAACTCTT G2GTAATCAT GACTATCTCT AGGATTCTOG CACCACTTCC 192 0 

40 TTCCCTGGCG CCTTAAGCCT AGCTGTGTAT CGGCACCCCC ACCCCACTAG AGTACTCCCT 193 0 

CTCACTTGCG GTTTC CTT AT AGTCCACCCC TTTCTCAAGG GTCCTTTTTT AAAGCACATC 204 0 

TCAGATTAAA AAAAAAAAAA AAAAAAAAAA AGGGGGGGGN GCNT 2 084 

45 



50 



(2) INFORMATION FOR SEQ ID NO: 228: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2143 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 228: 

TCGACCCACG CGTCCGGTTG AATTCCTTGA CCTGCAAACA CATATTTATT AGCCTGACTC 60 

AAACAATGAA GCTATTAAAA CTTCGGAGGA AC ATTGT AAA ACTCTCTTTG TATCGGCA1
TCACCAACAC C^TTATTTTG GCAGT-^GCAG CATCC ATTGT GTTTATCATC TGGACAACC 



TGAAGTTCAG AATAGTGACA TGTCAGTCGG ACTGG2GGGA G7TGTGC 



TCTGGCGCTT 3CTGTTCTCC ATGATCCTCT TTGTCATCAT G3TTCTCT-G CGACCATCTG 
CAAACAACCA GAG3TTTGCC TTTTCACCAT TGTCT': "AGGA AGAGGAG3AG GATGAACAAA 



TCGTTCTTTA GGATO3ACTG TTCTGGTATC TG< jTATTGGT TTAGAGACTG TTAATAAGCG 



120 
189 



;TA GACGATGCCA 249 



300 
360 



AGGAGC CT AT GCTGAAAGAA AGCTTTGAAG GAATO AAAAT GAGAAGTA2C AAACAAGAAC 420 



430 
940 
6C0 



CCAATGGAAA TA3TAAAGTT AACAAAGCA 2 AGGAAGATGA TTT G AAGTGG GTAGAAGAGA 
15 ATGTTCCTTC TTGTGTGACA GATGTAGCA2 TrCCAGCCCT TCTGGATTIIA GATGAGGAAC 
G AA1 '< GATC AC A 7 AOrTTGAA AGGTCCAAAA TGGAGTAAGG AATGCGAAGA TTTGCAGTTA 
, ^, rATrArr n-Tr^-:Tr;Tr^ CTCTTCTGTA 03GCTCCAT3 660 

20 

GGATTAAAGG AAGCAAT7,AC ATCCTGATCT GTTCCTTGAT CTTTO GCK" AT TGGAGTT3G7 
GAGAGGTGTC AGAACAAAGA GAACATCTTA CTGAAAACAA GTTCAT AA 3 A TGAGAAAAAT 

25 ctac:ga(3Ctt cttatttaca acactgctgc cccctttcct CCCAGACTCT GACATGGATG 

TTCATG2AAC TTAAGTGTGT TGTTCCTGAA CTTTCTGTAA TGTTTCATTT r A TOATCT3 
ACAAACTAAA AAGTTTAACG TCTTCTAAAA GATT-.1TCATC AACACCATAA TATGTAATCT 

30 

CCACJGAGCAA CT3CCTGTAA TTTTTATTTA TTTAGGGAGT TACATAGGTG ATGGGGGAAA 
TTGTTAACTA CCTTTCATTT TCCTGGGAAG TCAAGGTTAC ATCTTGCAGA C^TTGTTTTG 
35 AGAAAAAAGG G7CCTTCTGA GTTAAQ3AGC CATAGTTCTA TCAATGATCA AAAC^AAAA 1M0 
AAAAAAAAGA ( 3 AAAC T GTT A CAGTATGATT CAGATCATTT AAAAAAG7AA AATCAAGT(.;C 
AATTTTGTTT AGAAAT^GGTG TATATTAAAG ATTTTTCTAT TTCAGATGTA CTTT AAAG AG 
AAATATTAGC TTAACT=rTTT TGACATCTGC TATT*3T*GACA CATCC CAT TG <7TGGCAATGT 
GGTGCACACT CCG AAACTTT TAACTACTGT TTTGTAAGCC " ?C Z A^GCa >TG GCATTCCA3G 
45 GTCCTTAGGC AATGTTTTCT TTGC CTTT AT GCA ^AGAGCT G2TC7AA HV, 2TGTGA IT "1A 
GCACCGTGCT AGAGGAACTG TAAT3CTTCA GAA- 3TTGT AG CTT AT AC AAA GGAAACAG3T 
CCTGCTGGCT TAATTTAAAC AGTTATTGCA TGAAGTA3CG TGGAGGCrCT GGACTGCT3C 1560 



710 
780 
840 
900 
96 0 
1020 
10 3 0 



1200 
.1260 
1220 
IjPO 
1 4 A 0 
1I;00 



1620 



1 '-CP 



60 



WO 98/54963 PCT/US98/11422

TGTTAAAACC AATGACCACA TGACCACAAT CTTCACTAAC TC AT AC TC-C ?\ GTGAAAGTGT 192 0 

TAACCCTTAG GTAGTTTCTC TACAACTCTT TGCTATGGTG ATTTIT A^ AA AAG1TTCCTA 1930 

5 GGGAAGTATC TCTGAO 9G AA. CACtGCAATCT G AAG jAACTG ACTATATTCT CCATGGCTAA 2 04 0 

GTCCATTAGG CCAAAAGNCT GG IJ TGGGT AT TGGTTGT 2 AN GCTOTCTATT GGCATATTAA 2100 

AAACGTAGGC CGGANGGAAT AATTAGGTTG THAT-SCCGGC GGG 214 3 



15 



(2) INFORMATION FOR SEQ ID NO: 229:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1025 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 229:

CCTGCCCCAC ATTGCTTCAT TfXJCCTGGCC ATGCGCCTOT ACT ATGGC AG CCGCTAGTCC 6 0 

25 

CTGACAACTT CCACCCTGAT TCCGGACCCT GTAGATTGGG CGCCACCACC AGATCCCCCT 120 

CCCAGGCCTT CCTCCCTCTC C CATC AGO AG CCCTGTAACA AGTGCCTTGT GAGAAAAGCT 18 0 

30 GGAGAAGTGA GGGCAGCCAG GTTATTCTCT GGAGGTTGGT GGATGAAGGG GTACCCTA3G 24 0 

AGATGT'GAAG TGT ( .VGGTTTG GTT AAG GAAA T02TTACCAT CCCCCACCCC CAACCAAGTT 3 00 

CTTC CAGACT AAAGAATTAA GGTAACATCA ATACCTAGGC 2TGAGAAATA ACO rCATCCT 3 60 

35 

TGTTGGGCAG CTCCCTGCTT TGTCCTGCAT GAACAGAGTT 2ATGAAA 7PS GGGT3TGGGC 42 0 

AACAAGTGGC TTTCCTTGCC TAOTTTACTC ACCCAGCAGA 3CCACTGGAS CTGOSTAGTC 4 30 

40 CAGCCCAGCC ATGGTGOATG ACTCTTCCAT AAGGGATCCT CACCCTTCCA CTTTCATGCA 540 

AGAAGGCCCA GTTGCCACAG ATTATACAAC 'CAT/T AC CCAA ACCACTCTGA CAGT2TCCTC 600 

CAGTTCCAGC AATGCCTAGA GACATGCTCC CTGCCCTCTC CACAGTG2TG CT2CCCACAC 660 

45 

CTAGCCTTTG TTCTGGAAAC C C C AG AGAGG GCTGGGdTG ACTCATCTCA GG 3 AATGTAG 72 0 

CCCCTGG02C C TGGC T F AAG CCGACACTCC TGACCTCTCT GTTCACC2TG AGO GC TGTCT 730 

50 TGAAOCCOGC TACCCACTCT GAGCCTCCTA GGAG2TACCA TGCTTCCCAC TCTGGGGCCT 840 

GCCCCTGCCT AGCAGTCTCC CA3CTCCCAA CAG2CTGOGG AA3CTCTGCA CAGAGTGACC 900 

TGAGACCAGG TACAGGAAAC CTGTAGCTCA ATCAGTGTCT CTTTAACTGC ATAAGCAATA 960 

55 

AGATCTTAAT AAAGTCTTCT AGGCTGTAGG GTGGTTCCTA CAACCACAGC CAAAAAAAAA - 1Q2 0 

AAAAA 1025 

(2) INFORMATION FOR SEQ ID NO: 230:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 SO base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 230:
GCCCACGCGT COrXTCCAOJC GT:: :GGCGGT GCGGAGTAT3 GGGCGCTGAT GGCCATGGAG 60 

15 gsotactggc G2TTCOT03C goy-oct^ggg tc<3gcactoo tcgtcggctt cctgtcggtg 

AT5TTCGC2C TC3TCTGG3T CCTCCACTAC CGAGAGGG3C TTGGCTGGGA TGGGAGCGCA 
rTAGAGTT^A ACTGGCACGC AGTGCTSATG GTCACCGGGT TCGTCTTOAT CCAGGGCATC 

20 

GCATCATC GT C TACAGA2TG CC3TGGACCT G3AAATGCAG CAAG3TCCTG ATGAAATCCA 
TCCATGCAGG G1TAAATGCA GTTGCTG3CA TTCTTGCAAT TATCTCTGTG GT03CCGTGT 3 GO 

25 TTGAGAACCA CAATG1TAAC AATATAGCCA AT ATGT AC AG TCTGC ACAGC TG G 3TTGG AC 420 
TOATAOCTGT C AT ATGC T AT TTGTTACAGO TTCTTTCACG TTTTTCAGTO TTT 3TGCTTC 4 80 

CATG3GCTCC GCTTTCTCTC CGACOAITTC TCATGCCCAT AC ATGTTT AT TCTGGAATTG 54 0 



12 0 
180 

2 4 0 

3 00 



600 



TCAT3TTTOG AACAGTGATT CGAACAGCAC TTATGGGATT GACAGAGAAA CTGATTTTTT 
CCCTGAGAGA TCCTGCATAC AGTACATTCC CGCCAGAA(3G TGTTTTCGTA AATACGCTTG 
35 GGCTTCTGAT 0:TOGTGTT2 G3GGO :CTCA TTTTTTGCAT AGTCACCAGA CCCX'AATGGA 
AACGTCCTAA G3AGCCAAAT TCTACGATTC TT C AT C O AAA TGGAG 3CACT GAAGAGGG AG 
CAAGAGGTTC CATGCCAGCC TACTOTGGCA ACAACAT3GA CAAATCAGAT TC .' AGAGTT AA 84 0 

ACARTGAAGT A3GAGCAA<:3G AAAAG AAACT TAGCTCT3GA TGAGGCTGGG CAGAGATCTA 
OOATGTAAAA IGTTOTAGAG ATAGASCCAT ATAACGTCAC GTTTC AAAAC TAGCTOTACA 
45 OTTTTGCTTC TOOTATTAGC OATATGATAA TTGGGOTATG TAGTATCAAT ATTTA 2 TTT A 
ATCACAAA 3G ATGGTTTCTT ■ 3 A-\AT AATTT ■ 3T ATT 3 ATTG AGGCCTATGA ACD3ACCTGA 
ATTGGAAAGG ATGT 3 ATT AA TATAAATAAT A3CAGATATA AATTGTGGTT ATGTT ACCTT 114 0 

TATCTTGTTG A< 3G AO C AC AA CATTAG2ACG GTGCCTTGTG CAJKAATAGAT A CT CAATATG 12C0 



720 
7 30 



900 
Ci 6 0 



10R0 



(2) INFORMATION FOR SEQ ID NO: 231:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1311 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 231:
CIIGNCAGTAC CC3GTCNGATT CCCGGGTCGA CC 2ACGCGTC C COTGCATTC C^GGOCCTTT 60 

10 CAGTGGCTTT CATTCTGAAG TTCCTGGATA ACATGTTCCA TGTCTTGATG GC Z ■ 2AGGTT A 12) 

CCASTGTCAT T ATCAC AA' 3 A GTGTCTGTCC TGGTCTTTGA CTTCA03CCC TCCCTGOLAAT 180 

TTTTCTTGGA AGCCSCATCA GTC5TYCTCT CTATATTTAT TTATAATGCC A2CAAGCCTC 24 3 

15 

AAGTTCOCGA ATACGCACCT AGGCAAGAAA GGATCCGAGA TCTAAGTGGC A^TCTTTGGO 30 3 

A'XTGTTCCAG TGOCGATG3A 3AAGAACTAG AAA* IACTT AC CAAACC 2AAG A CTGATGAGT 36 3 

20 CAGATGAAGA TACTTTCTAA CTGGT ACCCA CATAGTTTGC AGCTCTCTTG AACGTTATTT 42 3 

T'CAC ATTTT C AGTGTTTGTA AT ATTT AT C T TTTCACTTTG ATAAACCAGA AATGTTTCTA 4 30 

AAT2CTAATA TTCTT'TGC AT ATATCTAGCT ACT2CCTAAA TGGTTC CATC CAAGGCTTAG 540 

25 

A' ~^T AC C CAAA OG 2 T AAGAAA TTCTAAAGAA CTGATACAG3 AGTAACAATA TGAAGAATTC 600 

ATT AAT ATCT CAGTACTTGA T AAATC AC- .AA AGTTATATGT GCAGATTATT TTCCTTGGCC 660 

30 TTCAAGCTTC CAAAAAACTT GT AAT AAT C A TGTTAG2TAT AG:TTGTATA TACACATAGA 72 0 

GAT C AATTTG CCAAATATTC ACAATCATGT AGTTCTAGTT TACATGCCAA AGTCTTCCCT 760 
TTTTAACATT AT AAAA GC T A GGTTGTCTCT TGAAlTTTTGA CO 2 C C T AG AG ATAGTCATTT 84 0 

35 

T<3CAAGTAAA GAGCAACGGG ACCCTTTCTA AAAACGTTGG TTGAAGGACC TAAA.TACCTG 9 CO 

GCCATACCAT AGATTTGCtGA T2ATGTAGTC TGT C a2 T AAAT ATTTTGCTGA AGAAGCAGTT 960 

40 TCTCAGACAC AACATCTCAG AATTTTAATT TTTAGAAATT CATGGGAAAT TOGATTTTTG 102 0 

TAATAATCTT TTGATGTTTT AAACATTGGT TCCCTAGTCA OCATAGTTAC CACTTGTATT 1080 

TTAAGTCATT TAAACAAGCC ACGGTGGGGC TTTTTTCTCC TCAGTTTGAG GAGAAAAATC 114 0 

45 

TTGATGTCAT TACTCCTGAA TTATTACATT TD3GAGAATA AGAGGGCATT TTATTTTATT 120 0 

7vGTTACTAAT TCAAG2TGTG ACTATTGTAT AT C TTTCC AA CIAGTTGAAAT GCTGGCTTCA 12 60 

50 C IAATCATACC AGATTG TO AG TGAAGCT 1 3AT G2CTAGGAAC TTTTAAAGGG ATCCTTTCAA 1320 

AAGG ATCAC T TAGCAAACAC ATGTTGAOTT TTAACTGATG TATGAATATT AATACTCTAA 1380 

AAATAGAAAG ACCAGTAATA T AT AAG TC AC TTTACAGTGC TACTTCA<2AC TTAAAAGTGC 144 0 

55 

ATGGTATTTT TCATGGTATT TTQ2ATG 2 AG CCAGTTAACT ■2TCGTAGATA -'GAG AAGTC AG . 1500. . 

GTGATAGATG ATATTAAAAA TTAGCAA^CA AAA* 3TG AC TT 3CTCAGGGTC ATGCAGX2TGG 1560 

60 GTGATGATAG AAGAGTGOGC TTTAACT3GC AGGCCTGTAT GTTT AC AG AC TACCATACTG 162 0 



WO 98/54963 PCT/US98/11422 






TAAATATGAG CTTTATGGTG TCATTCTCAG AAACTTATAC ATTTC T (YJTC TCCTTTCTCC 
TAAGTTTCAT GCAGATGAAT ATMOT.AAT ATACTATTAT ATAJVTTCATT TOTGATATCC 
A2AATAATAT G ACT ~^GC AAG AATTG jTGC A AATTTGTAAT T A AAA F AA 1 * r ATTAAACCTA 
AAAAAAAAAN N 

(2) INFORMATION FOR SEQ ID NO: 232:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2271 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 232:
CTGACCTCAT GGCGTAGAGC CTA5:AACAG CGCAGGCTCC CAGCCGAGTC CGTTATCGCC 
25 GCTGCCGTCC CGAAGAGGAT GAGGSGGGCA GCACAAGCGA AACTGCTCCC C3GGTCGGCC 
ATCCAAGCCC TTGTGGGGTT G3CG:GG2CG CTGGTCTTGG CGCTCCTGCT TGTGTCCG3C 
GCTCTATCCA GTGTTGTATC ACGGACTCAT TCACCGAGCC CAACCGTACT CAACTCACAT 
30 ATTTCTAC 2C CAAATGTGAA TCCTTTAACA CATGAAAACG AAACCAAACC TTCTATTTCC 
CAAATCAGCA CCACCCTCCC TCCCACGACG AGTACCAAGA AAAGTGGAGG AG2ATCTGTG 
35 GTCCCTCATC CCTCGCCTAC TCC7CTGTCT CAAGAGGAAG CTGATAACAA TGAAGATCCT 
AGTATAGAGG AGGAGGATCT TCTCMIGCTC AAC AGTTCTC CATCCACACC CAAACACACT 
CTAGACAATG GCGATTATG3 AGAACCAGAC TATGACTGGA CCACGGGCCC C AGGG AC G AC 
GACGAGTCTG ATNGACACCT TGGAAGAAAA CAGGGGTTAC ATGGAAATTG AACAGTCAGT 
GAAATCTTTT AAGATGCCAT CCT ;AAATAT AGAAGAGGAA GACAGCCATT TCTm.'.rt 
45 TCTTA'ITATT TTTGCTTTTT C-CArTGCIGT TS'lTTACATT ACATAT CACA ACAAAACKAA 
GATTTTTCTT CT3GTTCAAA GCA^AAATG GCGTGATGGC CTTTGTTCCA AAACAG03GA 
ATACCATCGC CTAGATCAGA ATGTT AATG A GGCAMGTCT TCTTTGAAGA 1TACCAATGA 
TTATATTTTT TAAAGCACTG TCATTTGAAT TTGCTTAT'JT AATlTTATTr GCTTGACTTT 



40 



50 



1530 
17-10 
1800 
1311 



60 
120 
180 
240 

300 

3 60 
420 
480 
540 
GOO 
660 

730 
340 
900 



' ; A GGTO AGTC 960
T'GCTACTTTT AAAAGATCCC AAA CTT G T AA CTAAATTCTG ACATATCTGT TACTGCTGAC 1200

TCACATTCAT TCTOCGCCAT TCAAATACTA TTTTTTATCC A7ATTTTTTT TTGTTCCCAA 12 6<> 

ACTGTAATGT ACAAGGATAT GTGTGATAAT (JCTTTGGATT TGAGTAATAT TTTTTTTTCT 13 2 0 

TCCAAGAAAA CTO 7T TTGGA TATTTTT AG A 1 1 AATTT AAA C ATAATTTAGG ATAATGATAT 13 80 

TGCTCAATCT GAOGACAATT TT AGGT AAAA GATTAAATGT GTCAAGAAAT CTTGGCAACA 144'.) 

GAGACTCT TC AGCTTGCAGT GGACAT AG AT AAAATGTTAC AGAGATACTA TTTTTTTGGT 1500 

T3GAATTACT ATATTAAATT TAG AAGC AG A AACTGGTAAA ATGTTAAATA CATGTACAAT 156 0 

TGCTTTTAGT TAGCAATTGA TTGTAGCATG ' 3GTTCCTCCA AGGTTTCAAG CAATGGGCAG 162 0 

AGTTTAAAAT T AT AT C AG AT TCGTTTACTT rGTTTATTAT TTTACAGTAA ATTTGA^TAA I6h0 

A TC TT AGO GG TCATTATCAC TTAAATAATA CTGTACCTAG GTCTTT'TAAA TT AAAATT AT 174 0 

ACCTGAAT3A AGTTGTTTGT AT ACA TAAACj TATATTTGTG TACAATTACC TTTTTT 7CCC 1300 

CACACTTGTT TTCTTTGTTT TTG TTTTTT A T03CAACTGG AAAGTATTTA CTA'TGGGATT 1860 

CATTTATGTC TGTGTTT 2TA T 1 2ATAAAG AA TTGATCAATA TGTAAATATG TG ATTTGAAC 192 0 

CATGGTTGAC TTACAAGTGT CACTACAGCT PTTTAGAAAA CATAGCCCTA ATATATGTTA 198 0 

AGCAGGACCC GGGTCAG7CA GTGGGCTTGC GG TTTATGT A GAGCTGGAAG AAGG2CGTCC 2 040 

A/IC C TG TC T C TTGGCCGGAC AGTGTACTTT GCTAATAGGG AAGGGAAGCA CAATGGAAAT 2100 

ACCCCTGAAC C GTTTT ATT G CAGTAATTTT TTTCATATC;T GAAACTATTA TTTAATATTT 2160 

35 TGAATAAGAT TTTAAAAAAT AAATGGCAAA GATATAAATC TAAAAAAAAA AAAAAAAAAA 2 22 0 

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAANANA N 2271 



30 



40 



(2) INFORMATION FOR SEQ ID NO: 233:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1338 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 233:

CTTCCGGTTC TCCGGGCAGC TGCCACTGCT GTAGCTTCTG CCACCTGCCA CGACCGGGCC 60 

TCTCCCTGGC GTTT03TCAC CTCTGCTTCA TTCTCCACCG CGCCTATGGT CCCTCTTGGA 12 0 

55 

GCCAGCGTGG CGNGGCTGGC GGC TCCO0O0 TGGTGAGAGA GCGGTCCGGG AA2 r 3LATGAAG 180_ 

GXTCTCGCAGT iXTTGCTGCTG TCTCAGCCAC CTCTTGGCTT CCGTCCTCCT CCTGCTGTTG 240 

60 CTGCCTGAA 7 TAAGCGGGYC CCTGGMAGTC CTGCTGCAGG '2AGCCGAGGC CGCGCCAGGT 3 00 






YTTGGGCCTC CTOACCCTAG ACCAGGACAT TACCGCCGCT GCCACCGGOC CCTOACCCCT 
GCCCA-3CAGC CQCG2CGTGG TCTGGCTGAA GCT^GGGGG CCGCGGGGCT CCGAGGGAGG 



(2) INFORMATION FOR SEQ ID NO: 234: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 amino acids
(B) TYPE: amino acid

(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 234:

Met Leu Ser Thr Gly Ile Glu Val Ala Arg Pro Pro Ala Thr Leu Leu
1 5 10 15

Gly Leu Met Phe Val Leu Thr Gly M>_G Pro Arg Gly Leu An Xaa



42 0 



CAATG3-CAGC AACC3TGT3G CCG3GGTTGA OACG3ACGAT CAG3v,AGGGA A3GOCGG3GA 4 30 



540 
600 
660 



ARGCTCGGTG GGTG3CGG3 C TTGCT 3T3AG CCCCAACCCT G303ACAAGG C C ATGAC CCA 
0 GCGGOCCCTG ACCGTGTT3A TGGTGGTGAG CG3CGOG3TG CTG3TGTACT TCGTGGTCAG 

GACGGTCAG3 atgagaacaa gaaaccgaaa gactaggaga TAT3GAGTTT TGG AC a ct aa 

CATAGAAAAT AT3GAATTGA CACCTTTAGA ACAGGATGAT GAGGATGATG ACAACACGTT 
GTTTGATGCC AATCATCCTC G A AG AT AAG A ATGTGCCTTT TG AT 3 AAAG A ACTTTATCTT 
T CT AC AATG A ACAGTG3AAT TTCTATGTTT AAGGAATAAG AAGGCACTAT ATCAATGTTG 
20 G^'a.* y/rATT T AAGTT AC AT ATATTTI3AAC AACCTTTAAT TTGOTGTTGC AATA>^vTACC 
GT ATGC TT 1 T ATTATATCTT TATATGTATA GAAGTACTCT GTTAATGGGG TCAGAGATGT 
TGGGGATAAA GTATACTGTA ATAATTTATC T3TTTGAAAA TTACTATAAA ACGGTGTTTT 

25 

CTGRTCGGTT TTTGTTT TCT GCTTAC CATA T3ATTGTAAA TTGTTTTATG TATTAATCAG 
TTAAT3CTAA TTATTTTD3C T<3AT3T2ATA TCTTAAAGAG CTATAAATTC 'CAACAACCAA 114 0 

30 CTGGT3TGTA AAAATAATTT AAAATY TC C T TTACTGAAAG GTATTTCCCA TTTTTGTGGG 1200 
GAAAAGAA3G CAAATTTATT ACTTTGTGTT C^CGTTTTTA AAATATTAAG AAATGTCTAA 1260 
CTTATTGTTT GCAAAA 3 AAT AAATATGATT TTAAATTCTC TTAAAAAAAA AAAAAAAAAC 132 0 

CCCGGGGGGG GGCCCG3N 



7 80 
84 0 
S* 0 G 
960 
1 0 2 0 
1080 



1338
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 235:

Met Asn Val Val Ile Val Ile Ile Leu Phe Ser Phe Asp Ser Val Gly
1 5 10 15

Thr Met Phe Ser Cys Asn Arg Ile Pro Lys Ile Thr Val Leu Asn Lys
20 25 30

Leu Lys Phe Xaa Cys Glu Val Leu Leu Arg Ile Gln Thr Ile Gln Gly
35 40 45

Phe Tyr Arg Cys Thr Arg Ile Ser Arg Tyr Lys Gly Ile Phe Pro Asp
50 55 60

Phe Cys Gln Ser Gln Cys Met Gly Cys Asn Pro Glu Ser Xaa Met Ala
65 70 75 80

Val Pro Ala Leu Val Thr Pro Ile Leu Ala His Arg Lys Lys Glu Lys
85 90 95

Gly Met Cys Leu Phe Thr Leu Ile Ile Ala Pro Thr Arg Cys Thr
100 105 110

Tyr Phe Cys Xaa
115



(2) INFORMATION FOR SEQ ID NO: 236:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 103 amino acids
(B) TYPE: amino acid

(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 236:

Met Ser Ser Ala Lys Ile Val Arg Gln Arg Gly Ala Val Pro Thr Tyr
1 5 10 15

Tyr Thr Thr Glu Ala Gly Glu Ile Ile Phe Leu Val Leu Asn Trp Ser
20 25 30

Leu Ser Ile Leu His Ile Val Asp Val Leu Cys Ser Lys Pro Glu Lys
35 40 45

Ser Val Thr Glu Asp Ala Ala Ser Gly Leu Ser Gln Arg Met Thr Ala
50 55 60

Leu Val Trp Arg Lys Gly Pro Asp Gly Gly Ser Arg Lys Pro Ile Leu
65 70 75 80

Leu Leu Phe Phe Phe Leu Pro Leu Ile Leu Cys Phe His Ser Phe Ile
85 90 95

His Ser Ser Asn Ile Cys Xaa
100

(2) INFORMATION FOR SEQ ID NO: 237:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 237:



Met Ile Leu Phe Pro Gln Xaa Ala Leu Arg Leu Gly Xaa Trp Pro Arg
1 5 10 15

Thr Trp Ser Ile Leu Xaa Lys Tyr Ser Val Asn Phe Phe Ser Ala Tyr
20 25 30

Ser Pro Met Gly Ala Val Gly Thr Glu Phe
35 40



(2) INFORMATION FOR SEQ ID NO: 238:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 37 amino acids
(B) TYPE: amino acid

(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 238:

Met Ile Ile Leu Leu Leu Phe Met Leu Leu Asn Asn Val Val Leu Val
1 5 10 15

Gln Glu Asp Asn Cys Gln Arg Lys Asn Thr Val Gln Glu Arg Arg Xaa
20 25 30

Trp Ser Gln Trp Xaa
35



(2) INFORMATION FOR SEQ ID NO: 239:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH : 12 3 amino acids 
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 239:

Met Ala Ala Xaa Pro Pro Gly Cys Thr Pro Pro Xaa Leu Leu Asp Ile
1 5 10 15

Ser Trp Leu Thr Glu Ser Leu Gly Ala Gly Gln Pro Val Pro Val Glu

65 70 75 80

Glu Val Arg Val Arg Arg Glu Arg Tyr Gln Thr Met Lys Val
85 90 95

Arg Ala Gly Leu Gly Pro Thr Pro Gly Met Pro Gly
100 105 110

Asp Asn Thr Val His Thr Met His Gly Glu Ala Asn Arg Gly Ser Xaa
115 120 125



(2) INFORMATION FOR SEQ ID NO: 240:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 57 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 240:

Met Ser Ile Leu Cys Cys Pro Xaa Leu Cys Leu Phe Phe Ser Phe Cys
1 5 10 15

Ile Ser Ser Gly Ser Cys Pro Phe Ser His Val Ser Gln Leu Ser Phe
20 25 30

Ile Ala Thr Phe Ser Gln Ser Ser Pro Val Leu Leu Val Pro Ala Tyr
35 40 45

Al:u Thr Tyr Leu Ser Phe Leu Ala Phe Leu Asp Cys Ala Ser Leu Thr 

50 55 60 

Ser Thr Xaa 

65 



(2) INFORMATION FOR SEQ ID NO: 241:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 69 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 241:

Met Ser Thr Phe Gln Leu Leu Leu Leu Ile Leu Ala Gln Ser Thr Tyr
1 5 10 15

Lys Ile Lys Ser Lys Pro Leu His Met Thr Asn His Thr Leu Leu Asn
20 25 30

Ser Pro Gly Leu Asn Pro Ser Ser Pro Thr Leu Asn Phe Lys Thr Gln
35 40 45

Gln His Glu Ser Val Ser Tyr Ala Cys Cys His Met Arg Ser Leu His
50 55 60

His Ala Phe Ala Xaa
65



(2) INFORMATION FOR SEQ ID NO: 242:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 44 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 242:

Met Val Ser Val Val Leu Ile Phe Ser Phe Leu Ser Leu Thr Ile Ser
1 5 10 15

Thr Ha Ala Tyr Asn Gly Asn Asp Thr Gln Gly Trp Asn Asp
20 25 30

Lys Phe His Xaa Xaa Ser Val Lys Thr Gln Thr Xaa
35 40



(2) INFORMATION FOR SEQ ID NO: 243:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 51 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 243:

Met Ile Ser Asp Ala Gly Ala Gly Phe Gly Val Phe Leu Leu Val Pro
1 5 10 15

Arg Ala Gly His Cys Trp Gly Ala Gly Lys Pro Leu Pro Ser Cys Pro
20 25 30

Ser Val Ala Ser Ile Pro Ser Trp Val Leu Pro Ser Phe Leu Glu Arg
35 40 45

Gly Arg Xaa
50



(2) INFORMATION FOR SEQ ID NO: 244:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 43 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
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Leu Lvs Pro L c 



(2) infgrmatisn 



(A) LENGTH: 51 amino acids

Met Ile Leu Met Pro Gly Leu Gly Thr Ser Gln Arg Ser Val Pro
1 5 10 15

Phe Val Pro Thr Leu -m .-Jd Air Thr Pro Gly Ala Met Thr Gly Pro
20 25 30

Thr Ala Thr Leu Thr Ser Ays Sir. Trr Thr Thr Ala Cys Arg Val Ser
35 40 45

Trp Ala Asn Gly Trp Trr Ser Leu Arg Thr Pro Arg Xaa
50 55 60



(2) INFORMATION FOR SEQ ID NO: 246:

Met Ser His His Ala Sir Pro At Pro Leu Leu Ile Thr Met Leu Leu
1 5 10 15

Gln Glu Ala Lys Pro Val Ser Aj=r. Ile Pro His Leu Leu Glu Ser Trp
20 25 30

Tyr Phe Gly Xaa
35



(2) INFORMATION FOR SEQ ID NO: 247:

(i) SEQUENCE CHARACTERISTICS:

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 247:

Met Asn Ser Leu Phe Trp Met _e Leu Leu Pro Val Ser Gln Asp Gln
1 5 10 15

Val Val Glu Gly Leu Gln Gly Gly Phe Ser Gln Ile His Met Arg Ile
20 25 30
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Leu Arg Lys His Leu Xaa 
5 



(2) INFORMATION FOR SEQ ID NO: 248:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 211 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 248:

Met Ser Arg Ser Xaa Asp Val Thr Asn Thr Thr Phe Leu Leu Met Ala
1 5 10 15

Ala Ser Ile Tyr Leu His Asp Gln Asn Pro Asp Ala Ala Leu Arg Ala
20 25 30

Leu His Gln Gly Asp Ser Leu Glu Cys Thr Ala Met Thr Val Gln Ile
35 40 45

Leu Leu Lys Leu Asp Arg Leu Asp Leu Ala Arg Lys Glu Leu Lys Arg
50 55 60

Met Gln Asp Leu Asp Glu Asp Ala Thr Leu Thr Gln Leu Ala Thr Ala
65 70 75 80

Trp Val Ser Leu Ala Thr Gly Gly Glu Lys Leu Gln Asp Ala Tyr Tyr
85 90 95

, , , . . -r — - o ^- n>-^ Thr T.^ii T.f^n L*^n Leu

Ile Phe Gln Glu Met Ala Asp uy^ ^ya — -
100 105 110

Asn Gly Gln Ala Ala Cys His Met Ala Gln Gly Arg Trp Glu Ala Ala
115 120 125

Glu Gly Leu Leu Gln Glu Ala Leu Asp Lys Asp Ser Gly Tyr Pro Glu
130 135 140

T^r Leu Val Asn Leu Ile Val Leu Ser Gln His Leu Gly Lys Pro Pro
145 150 155 160

Glu Val Thr Asn Arg Tyr Leu Ser Gln Leu Lys Asp Ala His Arg Ser
165 170 175

His Pro Phe Ile Lys Glu Tyr Gln Ala Lys Glu Asn Asp Phe Asp Arg
180 185 190

Leu Val Leu Gln Tyr Ala Pro Ser Ala Glu Ala Gly Pro Glu Leu Ser
195 200 205



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 548 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 249:

Met Glu Asp Ser Glu Ala Leu Gly Phe Glu His Met Gly Leu Asp Pro
1 5 10 15

Arg Leu Leu Gln Ala Val Thr Asp Leu Gly Trp Ser Arg Pro Thr Leu
20 25 30

Ile Gln Glu Lys Ala Ile Pro Leu Ala Leu Glu Gly Lys Asp Leu Leu
35 40 45

Ala Arg Ala Arg Thr Gly Ser Gly Lys Thr Ala Ala Tyr Ala Ile Pro
50 55 60

Met Leu Gln Leu Leu Leu His Arg Lys Ala Thr Gly Pro Val Val Glu
65 70 75 80

Gln Ala Val Arg Gly Leu Val Leu Val Pro Thr Lys Glu Leu Ala Arg
85 90 95

Gln Ala Gln Ser Met Ile Gln Gln Leu Ala Thr Tyr Cys Ala Arg Asp
100 105 110

Val Arg Val Ala Asn Val Ser Ala Ala Glu Asp Ser Val Ser Gln Arg
115 120 125

Ala Val Leu Met Glu Lys Pro Asp Val Val Val Gly Thr Pro Ser Arg
130 135 140

Ile Leu Ser His Leu Gln Gln Asp Ser Leu Lys Leu Arg Asp Ser Leu
145 150 155 160

Glu Leu Leu Val Val Asp Glu Ala Asp Leu Leu Phe Ser Phe Gly Phe
165 170 175

Glu Glu Glu Leu Lys Ser Leu Leu Cys His Leu Pro Arg Ile Tyr Gln
180 185 190

Ala Phe Leu Met Ser Ala Thr Phe Asn Glu Asp Val Gln Ala Leu Lys
195 200 205

Glu Leu Ile Leu His Asn Pro Val Thr Leu Lys Leu Gln Glu Ser Gln
210 215 220

Leu Pro Gly Pro Asp Gln Leu Gln Gln Phe Gln Val Val Cys Glu Thr
225 230 235 240

Glu Glu Asp Lys Phe Leu Leu Leu Tyr Ala Leu Leu Lys Leu Ser Leu
245 250 255

Ile Arg Gly Lys Ser Leu Leu Phe Val Asn Thr Leu Glu Arg Ser Tyr
260 265 270



Arg Leu Arg Leu Phe Leu Glu Gln Phe Ser Ile Pro Thr Cys Val Leu
275 280 285

Asn Gly Glu Leu Pro Leu Arg Ser Arg Cys His Ile Ile Ser Gln Phe
290 295 300

Asn Gln Gly Phe Tyr Asp Cys Val Ile Ala Thr Asp Ala Glu Val Leu
305 310 315 320

Gly Ala Pro Val Lys Gly Lys Arg Arg Gly Arg Gly Pro Lys Gly Asp
325 330 335

Lys Ala Ser Asp Pro Glu Ala Gly Val Ala Arg Gly Ile Asp Phe His
340 345 350

His Val Ser Ala Val Leu Asn Phe Asp Leu Pro Pro Thr Pro Glu Ala
355 360 365

Tyr Ile His Arg Ala Gly Arg Thr Ala Arg Ala Asn Asn Pro Gly Ile
370 375 380

Val Leu Thr Phe Val Leu Pro Thr Glu Gln Phe His Leu Gly Lys Ile
385 390 395 400

Glu Glu Leu Leu Ser Gly Glu Asn Arg Gly Pro Ile Leu Leu Pro Tyr
405 410 415

Gln Phe Arg Met Glu Glu Ile Glu Gly Phe Arg Tyr Arg Cys Arg Asp
420 425 430

Ala Met Arg Ser Val Thr Lys Gln Ala Ile Arg Glu Ala Arg Leu Lys
435 440 445

Glu Ile Lys Glu Glu Leu Leu His Ser Glu Lys Leu Lys Thr Tyr Phe
450 455 460

Glu Asp Asn Pro Arg Asp Leu Gln Leu Leu Arg His Asp Leu Pro Leu
465 470 475 480

His Pro Ala Val Val Lys Pro His Leu Gly His Val Pro Asp Tyr Leu
485 490 495

Val Pro Pro Ala Leu Arg Gly Leu Val Arg Pro His Lys Lys Arg Lys
500 505 510

L"3 Leu ^r Ser Ser Cys Arg Lys Ala Lys Arg Ala Lys Ser Gln Asn
515 520 525

Pro Leu Arg Ser Phe Lys His Lys Gly Lys Lys Phe Arg Pro Trrr Ala
530 535 540

Lys Pro Ser Xaa
545



WO 98/54963 PCT/US98/11422 






Met Thr Thr Val Pro Pro Ser Pro Arg Pro Met Ser Arg Pro Ser Glu
1 5 10 15

Arg Asn Met Arg Arg Pro Arg Gly Pro Ser Pro Leu Pro Ala Ser Pro
20 25 30

Arg Asn Ser Thr Pro Asp Glu Pro Asp Val His Phe Ser Lys Lys Phe
35 40 45

Leu Asn Val Phe Met Ser Gly Arg Ser Arg Ser Ser Ser Ala Glu Ser
50 55 60

Phe Gly Leu Phe Ser Cys Ile Ile Asn Gly Glu Glu Gln Glu Gln Thr
65 70 75 80

His Arg Ala Ile Phe Arg Phe Val Pro Arg His Glu Asp Glu Leu Glu
85 90 95

Leu Glu Val Asp Asp Pro Leu Leu Val Glu Leu Gln Ala Glu Asp Tyr
100 105 110

Trp Tyr Glu Ala Tyr Asn Met Arg Thr Gly Ala Arg Gly Val Phe Pro
115 120 125

Ala Tyr Tyr Ala Ile Glu Val Thr Lys Glu Pro Glu His Met Ala Ala
130 135 140

Leu Ala Lys Asn Ser Asp Trp Val Asp Gln Phe Arg Val Lys Phe Leu
145 150 155 160

Gly Ser Val Gln Val Pro Tyr His Lys Gly Asn Asp Val Leu Cys Ala
165 170 175

Ala Met Gln Lys Ile Ala Thr Thr Arg Arg Leu Thr Val His Phe Asn
180 185 190

Pro Pro Ser Ser Cys Val Leu Glu Ile Ser Val Arg Gly Val Lys Ile
195 200 205

Gly Val Lys Ala Asp Asp Ser Gln Glu Ala Lys Gly Asn Lys Cys Ser
210 215 220

His Phe Phe Gln Leu Lys Asn Ile Ser Phe Cys Gly Tyr His Pro Lys
225 230 235 240

Asn Asn Lys Tyr Phe Gly Phe Ile Thr Lys His Pro Ala Asp His Arg
245 250 255

Phe Ala Cys His Val Phe Val Ser Glu Asp Ser Thr Lys Ala Leu Ala
260 265 270

Glu Ser Val Gly Arg Ala Phe Gln Gln Phe Tyr Lys Gln Phe Val Glu
275 280 285

Tyr Thr Cys Pro Thr Glu Asp Ile Tyr Leu Glu
290 295



(2) INFORMATION FOR SEQ ID NO: 251:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 251:

Leu Leu Tyr Leu Leu Lys Val Xaa Val Ile Phe Val Phe Ser Ser Ser
1 5 10 15

Lys Gly Val Thr Leu Val Ser Met Asn Leu Thr Ser Phe Phe Val Ser
20 25 30

Ser Val Leu Ala Cys Phe Ser Xaa
35 40



(2) INFORMATION FOR SEQ ID NO: 252:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 594 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 252:

Met Pro Ala Ser Ser Leu Glu Ser Arg Ser Phe Leu Leu Ala Lys Lys
1 5 10 15

Ser Gly Glu Asn Val Ala Lys Phe Ile Ile Asn Ser Tyr Pro Lys Tyr
20 25 30

Phe Gln Lys Asp Ile Ala Glu Pro His Ile Pro Cys Leu Met Pro Glu
35 40 45

Tyr Phe Glu Pro Gln Ile Lys Asp Ile Ser Glu Ala Ala Leu Lys Glu
50 55 60

Arg Ile Glu Leu Arg Lys Val Lys Ala Ser Val Asp Met Phe Asp Gln
65 70 75 80

Leu Leu Gln Ala Gly Thr Thr Val Ser Leu Glu Thr Thr Asn Ser Leu
85 90 95

Leu Asp Xaa Leu Cys Tyr Tyr Gly Asp Gln Glu Pro Ser Thr Asp Tyr
100 105 110

His Phe Gln Gln Thr Gly Gln Ser Glu Ala Leu Glu Glu Glu Asn Asp
115 120 125

Glu Thr Ser Arg Arg Lys Ala Gly His Gln Phe Gly Val Thr Trp Arg
130 135 140



Tyr Glu Gln Ala Leu Asn Leu Tyr Thr Glu Leu Leu Asn Asn Arg Leu
180 185 190

His Ala Asp Val Tyr Thr Phe Asn Ala Leu Ile Glu Ala Thr Val Cys
195 200 205

Ala Ile Asn Glu Lys Phe Glu Glu Lys Trp Ser Lys Ile Leu Glu Leu
210 215 220

Leu Arg His Met Val Ala Gln Lys Val Lys Pro Asn Leu Gln Thr Phe
225 230 235 240

Asn Thr Ile Leu Lys Cys Leu Arg Arg Phe His Val Phe Ala Arg Ser
245 250 255

Pro Ala Leu Gln Val Leu Arg Glu Met Lys Ala Ile Gly Ile Glu Pro
260 265 270

Ser Leu Ala Thr Tyr His His Ile Ile Arg Leu Phe Asp Gln Pro Gly
275 280 285

Asp Pro Leu Lys Arg Ser Ser Phe Ile Ile Tyr Asp Ile Met Asn Glu
290 295 300

Leu Met Gly Lys Arg Phe Ser Pro Lys Asp Pro Asp Asp Asp Lys Phe
305 310 315 320

Phe Gln Ser Ala Met Ser Ile Cys Ser Ser Leu Arg Asp Leu Glu Leu
325 330 335

Ala Tyr Gln Val His Gly Leu Leu Lys Thr Gly Asp Asn Trp Lys Phe
340 345 350

Ile Gly Pro Asp Gln His Arg Asn Phe Tyr Tyr Ser Lys Phe Phe Asp
355 360 365

Leu Ile Cys Leu Met Glu Gln Ile Asp Val Thr Leu Lys Trp Tyr Glu
370 375 380

Asp Leu Ile Pro Ser Ala Tyr Phe Pro His Ser Gln Thr Met Ile His
385 390 395 400

Leu Leu Gln Ala Leu Asp Val Ala Asn Arg Leu Glu Val Ile Pro Lys
405 410 415

Ile Trp Lys Asp Ser Lys Glu Tyr Gly His Thr Phe Arg Ser Asp Leu
420 425 430

Arg Glu Glu Ile Leu Met Leu Met Ala Arg Asp Lys His Pro Pro Glu
435 440 445

Leu Gln Val Ala Phe Ala Asp Cys Ala Ala Asp Ile Lys Ser Ala Tyr
450 455 460

Glu Ser Gln Pro Ile Arg Gln Thr Ala Gln Asp Trp Pro Ala Thr Ser
465 470 475 480

Leu Asn Cys Ile Ala Ile Leu Phe Leu Arg Ala Gly Arg Thr Gln Glu
485 490 495
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